Automatic TypeScript definition generation from JSG RTTI#113
Conversation
|
@mrbbot ... ok, so the reason we had to revert this was twofold:
@mikea also needs to review the code changes here before it can land. The changes here LGTM but Mike is the domain expert so while I might sign off it should wait to land until both of the following happen:
|
11d9b65 to
f83ad19
Compare
|
This may have been fixed in newer commits, but slight bug in the version you posted; everything URL related is prefixed with URL. So, for instance, the URL type is urlURL. |
de64f5a to
4cd2b6f
Compare
mikea
left a comment
There was a problem hiding this comment.
I looked through all changes outside of types/. Sending comments so far.
I know it could be a hustle, but maybe we can break this PR into several safer ones?
In particular I'd like to see the rtti change and npm setup as separate PRs, since they have very different risk profiles.
types/src/generator/index.ts
Outdated
There was a problem hiding this comment.
@kentonv maybe we want this in src/types? or src/workerd/types?
There was a problem hiding this comment.
@kentonv .. just pinging again on this.
However, I don't think answering this should block this PR landing because we can always move this if necessary.
WORKSPACE
Outdated
There was a problem hiding this comment.
this will definitely make embedding workerd hard.
Maybe extract this to a separate function setup_js_dependencies in bzl file so that it can be called by downstream deps if needed?
There was a problem hiding this comment.
Unless I'm missing something, I don't think you can call load() in functions (bazelbuild/bazel#1550)? So we wouldn't be able to call rules_js_dependencies, rules_ts_dependencies, npm_translate_lock or npm_repositories in our function? We have the same problem with Rust dependencies atm: the internal repository currently duplicates all the Rust WORKSPACE rules. 😕
@penalosa we'll fix this when we add overrides that can rename types. 👍 The issue is that the spec-compliant URL implementation is under an additional |
mikea
left a comment
There was a problem hiding this comment.
TS code is rather straightforward (though a bit dense). I have left some comments.
Having some testing is probably my biggest concern now.
| import ts from "typescript"; | ||
| import { printNode } from "../print"; | ||
|
|
||
| // Replaces custom Iterator-like interfaces with built-in `Iterator` types: |
There was a problem hiding this comment.
I wonder: why do we generate something intermediate and patch it later, rather than generate correct property immediately?
I.e. when generating a property, check that its type is our iterator?
There was a problem hiding this comment.
Doing it this way allows us to keep the initial generation code simple and separate concerns. As JSG_ITERATOR generates 2 structures (one for the iterator, one for the next result containing the value type we need), the initial generation functions would need to be aware of the full program as opposed to just the structure they were generating code for, making testing more difficult. This approach also allows us to use TypeScript's transformer/visitation APIs that were designed for tasks like this. I'm not too worried about squeezing performance out of this program either.
|
|
||
| // If `typeNode` has the shape `T | undefined`, returns `T`, otherwise returns | ||
| // `undefined`. | ||
| export function maybeUnwrapOptional( |
There was a problem hiding this comment.
Do we really need to consult generated types to glimpes information from the original structure?
There was a problem hiding this comment.
If we used the original structure information, we'd need to separately assert the generated type nodes had the expected optional shape T | undefined in order to extract T and satisfy TypeScript. Therefore, it makes sense to do the check like this, which narrows the TypeScript type for us. This helper is also useful in structure.ts, when generating properties.
|
Exciting. |
| // Input: | ||
| // Binary Cap’n Proto file path, defaults to reading from stdin if omitted | ||
| export async function main(args?: string[]) { | ||
| const { values: options, positionals } = util.parseArgs({ |
There was a problem hiding this comment.
I would just caution here that util.parseArgs() is still considered experimental in Node.js. It's API can see breaking changes outside of normal semver rules. It's not expected to change, but it's worth keeping in mind.
|
|
||
| This directory contains scripts for automatically generating TypeScript types | ||
| from [JSG RTTI](../src/workerd/jsg/rtti.h). | ||
|
|
There was a problem hiding this comment.
Nit: not blocking for this PR but a request for later: It would be worthwhile documenting the dependencies used here... specifically the minimum Node.js version. I know the bazel stuff brings in the right version but just for detail information, especially given that the tool uses a number of experimental Node.js APIs that are fairly recent additions.
|
It appears that all of @mikea's feedback has been addressed. I'll clear their "change requested" now. |
|
@mrbbot this should be ready to land. Once landed, remember to rebase the matching internal PR on workerd main and let the CI run again there to confirm it's ready to land. |
(re-do of #110, let's try this again 🙂 )
Hey! 👋 This PR adds support for generating TypeScript types from JSG RTTI, replacing the internal
autodeclscript. This forms the basis for the nextworkers-typesversion. These scripts are located in this repository rather thanworkers-typesas they depend on Bazel outputs, and we'd like to be able to share the Bazel cache in CI. Going forward, DevProd should probably be theCODEOWNERfor everything in thetypesdirectory.To generate types, run:
A gist containing the generated types can be found here. This also includes a copy of the types before TypeScript transformers are applied.
(fyi, about half of the additions are from
pnpm-lock.yaml)Implementation Notes and Questions
jsg::fullyQualifiedTypeName()method. This behaves likejsg::typeName(), but includes namespaces and template arguments in the returned name. Namespaces are required to differentiate nested types with the same name (e.g.DurableObjectStorageOperations::GetOptions,KvNamespace::GetOptionsandR2Bucket::GetOptions). Template arguments are required to differentiatejsg::IteratorBase...types generated byJSG_ITERATOR.workerd::jsg::rtti::Builder, replaced the symbol key fromjsg::typeName()tojsg::fullyQualifiedTypeName()for the reasons above. This also affects which values should be passed as parameters toworkerd::jsg::rtti::Builder::structure().rtti.capnpschema. Generally, I tried to evolve the schema in a backwards-compatible way, but it would be cleaner if we're still able to make breaking changes. /cc @mikeafullyQualifiedNamefields toStructureandStructureTypeschemas for the reasons above. I was hesitant to makenamefully-qualified since I wasn't sure which other code depended on RTTI.jsg::LenientOptionalto RTTI. When generating TypeScript definitions, we also need to be able to distinguish between optionals expectingnull(kj::Maybe) and others expectingundefined. A newnamefield has been added to theMaybeTypeschema, similar toNumberTypeandStringType.kj::Array,kj::ArrayPtrandjsg::Sequence, so a newnamefield has been added to theArrayTypeschema. It may be better to make this and the previous maybe field enums instead?namefield in a group withnestedmembers. This ensures members coming fromJSG_NESTED_TYPE_NAMEDmacros have the correct names.iteratorandasyncIteratorfields to theStructureschema. Whilst there are alreadyiterableandasyncIterableboolean fields, we need to know the full method types (especially the returned(Async)Iteratortype) for[Symbol.iterator]/[Symbol.asyncIterator].api-encoder.c++entrypoint that spits out RTTI to a file. All TypeScript generation scripts are written in TypeScript, so we can use the official TypeScript compiler API for creating/processing/printing AST nodes.aspect-build/rules_jsrequires us to use pnpm. We should be able to use this setup for packaging/publishingworkerdnpmpackages too. /cc @penalosaTODOs
(to follow in later PRs)
autodecl😅), they're not as accurate as they could be.param0,param1, ... . Ideally, we'd use the actual C++ parameter names here, so we need some way of getting these into the RTTI.