Please note: from now onwards, most of my stories will be originally published on iHateReading.
Under the Hood
This story began when I was researching how generative AI can generate websites.
For example, we provide a prompt and the AI returns a JavaScript or React codebase that renders a landing page.
This part is tricky because generative AI and GPT models hallucinate. We can't give GPT models complete freedom to generate whatever output they like.
The response from a GPT model should be constrained by a set of rules. Those rules come from the syntax of the target language, and the tree-shaped representation of that syntax is called an AST (Abstract Syntax Tree).
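To make the idea of "limiting the output to certain rules" concrete, here is a minimal sketch that checks a generated snippet against a whitelist of AST node types. It uses Python's built-in `ast` module only because it runs without installing anything; the same idea applies to JavaScript with a parser such as Babel's. The `ALLOWED_NODES` whitelist and the `check_generated_code` function are hypothetical names made up for this illustration, not part of any real pipeline.

```python
import ast

# Hypothetical whitelist: the only node types generated code may contain.
ALLOWED_NODES = (
    ast.Module, ast.Assign, ast.Expr, ast.Name, ast.Constant,
    ast.BinOp, ast.Add, ast.Mult, ast.Load, ast.Store,
)

def check_generated_code(source):
    """Return a list of problems; an empty list means the code passed."""
    try:
        tree = ast.parse(source)
    except SyntaxError as err:
        # Output that does not even parse is rejected immediately.
        return [f"syntax error: {err.msg}"]
    problems = []
    for node in ast.walk(tree):
        if not isinstance(node, ALLOWED_NODES):
            problems.append(f"disallowed node: {type(node).__name__}")
    return problems

print(check_generated_code("total = price * 2 + 1"))  # passes the whitelist
print(check_generated_code("import os"))              # imports are not allowed
print(check_generated_code("total = = 1"))            # not valid syntax
```

A whitelist is a deliberate choice here: anything the rules do not explicitly allow is rejected, which is a safer default for model output than trying to list everything that is forbidden.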
Basic Explanation
Imagine a 5-year-old child named Lily who loves to play and is extremely imaginative. Lily is learning to write small words like "dog" and "cat," recognize the difference between small and capital letters, and draw more recognizable pictures like houses and flowers. She is also starting to exclude other children during play as she forms regular friendships. Lily can count up to 100, do basic math like adding apples, and tie her shoes. Additionally, she enjoys reciting nursery rhymes, recognizing rhymes in books, and acting out stories with her toys.
In this example, Lily's development showcases the complexity and diversity of skills a 5-year-old child acquires. ASTs play a similar role in programming: they let us analyze and understand the structure of code, just as we observe and understand the developmental milestones and interconnected skills of a child like Lily.
ASTs help programmers navigate through code, identify patterns, and make transformations efficiently, much like how parents and educators track a child's progress and provide appropriate guidance for their growth and learning. The importance of ASTs lies in their ability to represent the intricate relationships and structures within code, enabling developers to write more efficient and error-free software by understanding the code's syntax and semantics deeply.
What is an AST (Abstract Syntax Tree)?
Abstract Syntax Trees (ASTs) are data structures used in computer science to represent the structure of a program or code snippet. Most programming languages are specified by a context-free grammar (CFG), but a CFG cannot express some aspects of a language, such as variable types, operator overloading, and duck typing.
ASTs can preserve variable types, the location of each declaration in source code, the order of executable statements, and identifiers and their assigned values. They are used intensively during semantic analysis, where the compiler checks for correct usage of the elements of the program and the language.
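As a small illustration of what a tree preserves, the sketch below parses two lines of code and prints each statement's type, target identifier, and source location. Python's built-in `ast` module is used here as a stand-in; JavaScript parsers expose similar information on their nodes. The sample source string is made up for the example.

```python
import ast

source = "x = 1\ny = x + 2\n"
tree = ast.parse(source)

# Top-level statements are stored in source order in tree.body.
for stmt in tree.body:
    target = stmt.targets[0]
    print(type(stmt).__name__, target.id, "line", stmt.lineno, "col", stmt.col_offset)

# ast.dump shows the nested structure of a single node (here, `x + 2`).
print(ast.dump(tree.body[1].value))
```

Each node keeps its position in the source and its children (the assigned value, the operands of `+`), which is the information later phases such as semantic analysis rely on.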
The AST is also used to generate an intermediate representation (IR) for code generation.
ASTs have a wide range of applications and use cases, including:
- Compilers: ASTs are used in compilers to represent the structure of a program and to perform semantic analysis, where the compiler checks for correct usage of the elements of the program and the language.
- Linting: ASTs are used in linting tools like ESLint to traverse the tree and gain insights or perform actions based on the different nodes of the tree.
- Code Transformation: ASTs are used in code transformation tools like Babel to modify the tree, for example to transpile newer syntax down to older syntax or to turn JSX into function calls.
- Code Formatting: ASTs are used in code formatting tools like Prettier to format code based on the tree structure.
- Code Clone Detection: ASTs are used in code clone detection tools to detect similar code patterns and structures.
- AST Differencing: AST differencing, or tree differencing, is used to compute the list of differences between two ASTs and to generate an edit script that directly refers to the AST of the code.
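The linting use case from the list above can be sketched in a few lines. This is not how ESLint is implemented; it is a minimal analogue in Python using the built-in `ast.NodeVisitor`, with a hypothetical rule, `NoPrintRule`, that plays the role of a rule like ESLint's `no-console`.

```python
import ast

class NoPrintRule(ast.NodeVisitor):
    """Hypothetical lint rule: report every call to print()."""

    def __init__(self):
        self.warnings = []

    def visit_Call(self, node):
        # Only plain `print(...)` calls match; method calls like x.upper() do not.
        if isinstance(node.func, ast.Name) and node.func.id == "print":
            self.warnings.append(f"line {node.lineno}: avoid print()")
        self.generic_visit(node)  # keep walking into nested calls

source = "def greet(name):\n    print('hi', name)\n    return name.upper()\n"
rule = NoPrintRule()
rule.visit(ast.parse(source))
print(rule.warnings)  # ['line 2: avoid print()']
```

The rule never looks at the raw text. It only reacts to one node type, which is why AST-based linters are not fooled by whitespace, comments, or strings that happen to contain the word "print".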
Tools based on AST
Babel
Babel is a JavaScript compiler that includes an AST parser and generator. It can be used to convert code written in the latest JavaScript features to an older version that is compatible with more browsers.
TypeScript
TypeScript is a superset of JavaScript that adds static typing and other features. Its compiler parses source code into an AST and exposes that tree through the compiler API. TypeScript can be used to catch errors early in the development process, and it can also help to make code more readable and maintainable.
ESLint
ESLint is a static analysis tool for JavaScript that uses ASTs to represent code and perform checks. It can be used to enforce coding standards, catch errors early in the development process, and improve the overall quality of your code.
Prettier
Prettier is a code formatter that uses ASTs to represent code and perform formatting. It can be used to automatically format your code according to a consistent style, saving you time and helping to ensure that your code is easy to read and maintain.
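The core idea behind AST-based formatting can be sketched as a round trip: parse the code into a tree, then print the tree back out in one consistent style. The example below uses Python's `ast.parse` and `ast.unparse` (available from Python 3.9) as a stand-in. It is not how Prettier works internally, and unlike a real formatter it drops comments and ignores line width.

```python
import ast

messy = "x   =  ( 1+2 )\nif x>2 :  print( 'big' )\n"

# The tree does not record spacing, so printing it yields one canonical layout.
formatted = ast.unparse(ast.parse(messy))
print(formatted)
```

Because the original whitespace never makes it into the tree, two differently spaced versions of the same code come out identical.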
Jest
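Jest is a testing framework for JavaScript. It does not run tests on ASTs directly, but it relies on AST-based tooling: by default it transforms your code with Babel before running it, and its code coverage analysis works by instrumenting the code. It can be used to write unit tests and integration tests, and it includes features such as snapshot testing and code coverage analysis.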
Babylon
Babylon is a JavaScript parser that generates ASTs. It was renamed to @babel/parser in Babel 7. It can be used to parse JavaScript code and generate an AST that can be used for further processing or analysis.
Recast
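Recast is a tool for transforming JavaScript code using ASTs. It can be used to perform a wide range of code transformations, and when it prints the modified tree back to source it tries to preserve the original formatting of the parts you did not change, which makes it well suited to automated refactoring.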
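A code transformation of this kind follows a common pattern: visit nodes, return a replacement node where a pattern matches, then print the tree back to source. The sketch below shows that pattern with Python's built-in `ast.NodeTransformer`. It does not use Recast's API, `ast.unparse` (Python 3.9+) does not preserve original formatting the way Recast does, and the `SquareToMultiply` rule is a hypothetical example.

```python
import ast

class SquareToMultiply(ast.NodeTransformer):
    """Rewrite `name ** 2` as `name * name`."""

    def visit_BinOp(self, node):
        self.generic_visit(node)  # transform nested expressions first
        if (isinstance(node.op, ast.Pow)
                and isinstance(node.left, ast.Name)
                and isinstance(node.right, ast.Constant)
                and node.right.value == 2):
            # Build a replacement node: left * left
            return ast.BinOp(
                left=node.left,
                op=ast.Mult(),
                right=ast.Name(id=node.left.id, ctx=ast.Load()),
            )
        return node

tree = ast.parse("area = side ** 2 + 1")
# New nodes have no line/column info; fix_missing_locations fills it in.
new_tree = ast.fix_missing_locations(SquareToMultiply().visit(tree))
print(ast.unparse(new_tree))  # area = side * side + 1
```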
AST Explorer
AST Explorer is a web-based tool for visualizing and exploring ASTs. It can be used to understand the structure of your code, debug issues, and experiment with code transformations.
AST Types
AST Types is an implementation of the abstract syntax tree type hierarchy used by many JavaScript tools. It can be used to provide a consistent and standardized way of working with ASTs, and it includes type definitions for many popular JavaScript tools.
Advantages
An Abstract Syntax Tree (AST) is a tree-like structure that represents the syntactic structure of a program, abstracting away certain details and retaining just enough information to help the compiler understand the structure of the code. ASTs are typically generated during the parsing phase of the compilation process, where source code is broken down into tokens and then parsed into a tree-like structure. The advantages of using an AST over other code analysis techniques include:
- Quicker to produce: Building an AST takes less work than running a full compiler through to code generation, which makes it an efficient starting point for analyzing code.
- Simpler than compilers: AST interpreters tend to be simpler than compilers because the whole code generation phase can be skipped. This makes them easier to work with and maintain.
- Faster development: If your program doesn't do heavy computation, an AST interpreter gets it up and running sooner than a compiler would. This can lead to faster development and prototyping.
- Code visualization: ASTs can be used to visualize the structure of code, making it easier to understand and navigate. This can be particularly useful for large and complex codebases.
- Code transformation: ASTs can be manipulated to transform code from one form to another. This can be useful for optimizing code, adding new features, or fixing bugs.
- Code analysis: ASTs can be analyzed to identify potential issues or vulnerabilities in the code. This can be particularly useful for security-critical applications or for enforcing coding standards.
- Code generation: ASTs can be compiled into machine code or bytecode, making it possible to generate executable code from the AST.
In summary, ASTs offer a more efficient and flexible way to analyze and manipulate code, making them a powerful tool for developers and researchers alike.
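Two of the points in the list above, interpreting an AST directly and generating code from it, can be put side by side in a short sketch. It uses Python's built-in `ast` module and `compile`; the `evaluate` function and the `OPS` table are hypothetical helpers written for this example, and they only handle basic arithmetic.

```python
import ast
import operator

# Map AST operator node types to the functions that implement them.
OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}

def evaluate(node):
    """Tree-walking interpreter for arithmetic expressions only."""
    if isinstance(node, ast.Expression):
        return evaluate(node.body)
    if isinstance(node, ast.Constant):
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in OPS:
        return OPS[type(node.op)](evaluate(node.left), evaluate(node.right))
    raise ValueError(f"unsupported node: {type(node).__name__}")

tree = ast.parse("6 * 7 + 1", mode="eval")

# Route 1: interpret the tree directly, with no code generation step.
interpreted = evaluate(tree)

# Route 2: compile the same tree to bytecode, then run the bytecode.
code = compile(tree, filename="<ast>", mode="eval")
compiled = eval(code)

print(interpreted, compiled)  # 43 43
```

The interpreter is about a dozen lines because it skips code generation entirely, while the compiled route hands the same tree to Python's own bytecode compiler.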
That's it for today
See you in the next one
Shrey