《程序设计方法学》知识整理
上了永平奖获得者翁恺老师《程序设计方法学》这门课。这是一门很有趣的课,介绍了程序设计语言的历史和发展脉络。可惜这是一节早课,我经常在意识模糊中听课(甚至是没有成功起床)。
这门课有一半的时间是同学展示。我倒是更希望翁老师能多留出时间自己讲。
最终的期末考试内容为:BNF,数据类型和运算符,程序控制结构,流计算,递归,并行计算,函数的调用和返回,Actor,异步执行。我对他们做了一个简单的整理。
三种编程语言
- imperative language 命令式程序设计语言
- Variables, assignment statements, and iteration
- Include languages that support object-oriented programming, scripting languages, visual languages
- Ex.: C, Java, Perl, JavaScript, Visual BASIC .NET
- Functional Language
- Computing by applying functions to given parameters
- Ex.: LISP, Scheme, ML
- Logic Language (Highly inefficient)
- Rule-based (rules are specified in no particular order)
- Ex.: Prolog
历史
- FORTRAN 1950s-1960s - 第一门高级语言
- Pascal:第一门结构化语言
- ALGOL 1960s:
- Structured programming, free format lexical
- Top-down design and step-wise refinement
- LISP: 1958 - beginning of functional programming
- Only two data types: atoms and lists
- Computations by applying functions to parameters
- No concept of variables (storage) or assignment
- Control via recursion and conditional expressions
- BASIC: 1963 - beginning of Timesharing language
- BASIC: Beginner’s All-purpose Symbolic Instruction Code
- BASIC () " Kemeny & Kurtz at Dartmouth, 1963
- keyword: interactive
- SIMULA: 1961- beginning of Data Abstraction
- SmallTalk:1972 - beginning of Object-Oriented Programming
- Language evaluation criteria
- Readability
- Writability
- Reliability
- Cost
- The major methods of implementing programming languages are: compilation, pure interpretation, and hybrid implementation.
BNF(Backus-Naur Form)
- Syntax: 语法 Semantics: 语义
- BNF is a metalanguage–a language used to describe another language
- BNF 概念
+ lexeme 语素
+ token 标记 - BNF 结构
- Any word or words enclosed in angle brackets, < >, is a nonterminal – it stands for some string (it’s like a string variable)
- The symbol ::= or -> means “is defined as”
- The symbol | means “or”
- Anything else is a terminal – it stands for itself
- 缺点
- BNF isn’t great for setting limits on lengths
- BNF cannot do numeric comparisons
- BNF can only describe local syntax; it cannot refer to another part of the program
- EBNF:引入正则表达式来简化BNF
Binding 绑定
- Imperative languages are abstractions of von Neumann architecture.
- Variables memory cells
- Expressions CPU executions
- Variable Attributes
- Name
- Value
- Type
- lifetime
- scope
- Binding: an association between a variable and its storage or its type.
- A binding is static if it first occurs before run time and remains unchanged throughout program execution.
- A binding is dynamic if it first occurs during execution or can change during execution.
- Variables:
- Static Variables
- Stack-dynamic Variables
- Explicit Heap-dynamic Variab: All objects in Java.
- Implicit Heap-dynamic Variables: Allocation and deallocation caused by assignment statements, regardless of what the variable was previously used for.
- Scope
- Static Scope: Scope of a variable can be statically determined.
- Variables can be hidden from a unit by having a “closer” variable with the same name.
- Dynamic Scope:
- Based on calling sequences of program units, not their textual layout.(在命名空间里找变量时并不是沿着 static chain跳的,而是一直往caller那儿跳)
- References to variables are connected to declarations by searching back through the chain of subprogram calls that forced execution to this point
- Static Scope: Scope of a variable can be statically determined.
Type 数据类型
- Uses of Type
- Program organization and documentation
- Identify and prevent errors
- Support optimization
- primitive data types: Those not defined in terms of other data types
- String
- 注意 C/C++ 里 String 是有上限的(maxsize = unsigned(-1))
- Static length: compile-time descriptor
- Limited dynamic length: may need a run-time descriptor for length (but not in C and C++)
- Dynamic length: need run-time descriptor; allocation/deallocation is the biggest implementation problem.
- Two common user-defined ordinal types
- Enumeration: All possible values, which are named constants, are provided or enumerated in the definition, e.g., C# example enum days {mon, tue, wed, thu, fri, sat, sun};
- Subrange: An ordered contiguous subsequence of an ordinal type, e.g., 12…18 is a subrange of integer type.
- Array
- Static: subscript ranges are statically bound and storage allocation is static (before run-time)
- Fixed stack-dynamic: subscript ranges are statically bound, but the allocation is done at declaration time during execution
- C and C++ arrays without static modifier
- Stack-dynamic: subscript ranges and storage allocation are dynamically bound at elaboration time and remain fixed during variable lifetime.
- Fixed heap-dynamic: storage binding is dynamic but fixed after allocation, allocated from heap
- C and C++ through malloc
- Heap-dynamic: binding of subscript ranges and storage is dynamic and can change
- C#: ArrayList
- Union
- In Fortran, C, and C++, no language-supported type checking for union, called free union
- unsafe: Do not allow type checking.
- Discriminated union: supported by Ada.
- In Fortran, C, and C++, no language-supported type checking for union, called free union
- Pointer and Reference
- pointer: has a range of values that consists of memory addresses and a special value, nil.
- Operations: assignment and dereferencing
- Dangling pointers (dangerous)
- Tombstone
- Locks-and-keys
- Lost heap-dynamic variable memory leakage
- reference : refers to an object or a value in memory, while a pointer refers to an address
- not sensible to do arithmetic on references
- Pointers or references are necessary for dynamic data structures.
- pointer: has a range of values that consists of memory addresses and a special value, nil.
- Heap Management
- Reference counters (eager)
- Garbage collection (lazy)
- Disadvantages: when you need it most, it works worst.
- Type Checking
- A programming language is strongly typed if type errors are always detected.
- FORTRAN 77 is not: EQUIVALENCE
- C and C++ are not: unions are not type checked
- Type Equivalence
- Name Type Equivalence (类型的)名字要严格一样
- Structure Type Equivalence 只需要有同样的结构
- A programming language is strongly typed if type errors are always detected.
递归和尾递归
并发
-
并发(concurrency):⼀个程序的多个任务同时执行
-
并行(parallellism):⼀个任务分解为多个字任务同时执行,协作完成⼀ 个问题
-
分布式:并行的计算在不同的计算机上进行
lambda表达式和流
- LJY的整理
- 四类基本的函数式接口:
Function:R apply(T t);
输入类型T返回类型R。Consumer:void accept(T t);
输入类型T,消费掉,无返回。Predicate:boolean test(T t);
输入类型T,并进行条件判断,返回true 或 false。Supplier:T get();
无输入,产生一个T类型的返回值。
Expression
- Side Effects
- Functions in pure mathematics do not have side effects
- Solution 1: define the language by disallowing functional side effects
- No two-way parameters in functions
- No non-local references in functions
- Disadvantage: inflexibility of one-way parameters and lack of non-local references
- Solution 2: write the language definition to demand that operand evaluation order be fixed
- Java requires that operands appear to be evaluated in left-to-right order
- Overloaded Operators
- Short Circuit Evaluation
- Type Conversions
- narrowing conversion
- widening conversion
Control
To be continued…
Subprograms
- activation record (AR)
- Format, or layout, of non-code part of an executing subprogram is called **activation record (AR) **.
- For a “simple” subprogram, AR has fixed size, and can be statically allocated (not in stack) .
- Allocate local variables on the run-time stack
+ Main advantage: support recursion
+ Use Base pointer (BP): Always points at the base of the activation record instance of the currently executing program unit.
+ When a subprogram is called, the current BP is saved in the new AR instance and the BP is set to point at the base of the new AR instance.
+ Upon return from the subprogram, BP is restored from the AR instance of the callee. - Local offset
- Local variables can be accessed by their offset from the beginning of the activation record, whose address is in the BP. This offset is called the local_offset
- The local offset of a local variable can be determined by the compiler at compile time.
- static link
- Static link in an AR instance points to bottom of AR instance of the static parent
- dynamic link
- 在子程序返回时恢复而保存的 base pointer 称为动态链指针
- static chain
- (relative chain depth, local_offset)to find a variable.
All articles in this blog are licensed under CC BY-NC-SA 4.0 unless stating additionally.