**`packages/embeddinggemma-embedding-bridge/README.md`** (new file, +79)
# EmbeddingGemma Embedding Bridge

A bridge package that wraps **EmbeddingGemma**, Google's on-device embedding model, so it can run on the `@xenova/transformers` runtime.

## Installation

```bash
pnpm add embeddinggemma-embedding-bridge @xenova/transformers
```

## Quick Start

```ts
import { createEmbeddingGemmaBridge } from 'embeddinggemma-embedding-bridge';

const bridge = createEmbeddingGemmaBridge();
const { embeddings } = await bridge.embed({ input: '안녕하세요' });

console.log('embedding length:', embeddings.length);
```

## Configuration Options

Every setting passed to `createEmbeddingGemmaBridge` (or to the constructor) is optional; override only the parts you need.

```ts
const bridge = createEmbeddingGemmaBridge({
  model: 'google/embedding-gemma-002',
  pipeline: {
    revision: 'main',
    quantized: true,
    cacheDir: '/models/gemma',
    localFilesOnly: true,
    device: 'gpu',
  },
  embedding: {
    pooling: 'cls',
    normalize: false,
    batchSize: 4,
  },
});
```

### General Options

- `model` – Hugging Face model ID to load. Defaults to `google/embedding-gemma-002`.

### `pipeline`

Options passed through when the `@xenova/transformers` `pipeline` is created.

- `revision` – Model revision to load
- `quantized` – Whether to use a quantized checkpoint
- `cacheDir` – Model download/cache directory (`cache_dir`)
- `localFilesOnly` – Whether to use only locally cached files (`local_files_only`)
- `progressCallback` – Progress callback invoked while the model loads
- `device` – Execution device (`'cpu'`, `'gpu'`, a numeric index, etc.)
- `dtype` – Weight data type (e.g. `'fp16'`)
- `executionProviders` – List of ONNX Runtime execution providers
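The camelCase keys above are translated into the snake_case names that `@xenova/transformers` expects (for example, `cacheDir` becomes `cache_dir` and `executionProviders` becomes `execution_providers`). A minimal sketch of that translation, written as a hypothetical standalone helper rather than the bridge's actual internals:

```typescript
// Hypothetical helper illustrating the camelCase -> snake_case option
// mapping; the bridge's real implementation may differ.
interface PipelineConfig {
  revision?: string;
  quantized?: boolean;
  cacheDir?: string;
  localFilesOnly?: boolean;
  device?: string | number;
  dtype?: string;
  executionProviders?: string[];
}

function toPipelineOptions(config: PipelineConfig): Record<string, unknown> {
  const options: Record<string, unknown> = {};
  // Only include keys the caller actually set, so defaults stay untouched.
  if (config.revision !== undefined) options.revision = config.revision;
  if (config.quantized !== undefined) options.quantized = config.quantized;
  if (config.cacheDir !== undefined) options.cache_dir = config.cacheDir;
  if (config.localFilesOnly !== undefined) options.local_files_only = config.localFilesOnly;
  if (config.device !== undefined) options.device = config.device;
  if (config.dtype !== undefined) options.dtype = config.dtype;
  if (config.executionProviders !== undefined) options.execution_providers = config.executionProviders;
  return options;
}
```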

### `embedding`

Options forwarded to the pipeline call when embeddings are actually computed.

- `pooling` – `'mean' | 'max' | 'cls'` (default: `'mean'`)
- `normalize` – Whether to apply L2 normalization (default: `true`)
- `batchSize` – Batch size for processing (`batch_size`)

## Behavior

- Input must be a string or multimodal content with `contentType: 'text'`; other content types are rejected.
- Tensors returned by the pipeline are automatically flattened into `number[]` (single input) or `number[][]` (batched input).
- `getMetadata()` reports the embedding dimension, estimated from the model config's hidden size.
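The flattening step can be sketched as follows. This is an illustrative standalone function under an assumed output shape (a flat `Float32Array` plus a `dims` array), not the bridge's actual source:

```typescript
// Illustrative sketch: a 1-D tensor becomes number[], a 2-D tensor
// becomes number[][], and a size mismatch is surfaced as an error.
interface TensorLike {
  data: Float32Array;
  dims?: number[];
}

function flattenTensor(tensor: TensorLike): number[] | number[][] {
  const values = Array.from(tensor.data);
  const dims = tensor.dims ?? [values.length];
  if (dims.length <= 1) {
    return values; // single embedding vector
  }
  const rows = dims[0];
  const cols = dims[dims.length - 1];
  if (rows * cols !== values.length) {
    throw new Error('Unexpected pipeline output length for the provided input batch.');
  }
  // Split the flat buffer into one row per input item.
  return Array.from({ length: rows }, (_, i) => values.slice(i * cols, (i + 1) * cols));
}
```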

## Testing

```bash
pnpm --filter embeddinggemma-embedding-bridge test:ci
```
**`packages/embeddinggemma-embedding-bridge/package.json`** (new file, +54)
{
"name": "embeddinggemma-embedding-bridge",
"version": "0.0.1",
"description": "EmbeddingGemma Embedding Bridge",
"main": "./dist/index.js",
"module": "./esm/index.js",
"types": "./dist/index.d.ts",
"exports": {
  ".": {
    "types": "./dist/index.d.ts",
    "import": "./esm/index.js",
    "require": "./dist/index.js"
  }
},
"files": [
"dist",
"esm",
"README.md"
],
"sideEffects": false,
"scripts": {
"build": "pnpm clean && tsc -p tsconfig.json && tsc -p tsconfig.esm.json",
"dev": "tsc -p tsconfig.json",
"test": "vitest run",
"test:ci": "vitest run --exclude='src/**/*.e2e.test.ts'",
"test:watch": "vitest",
"test:coverage": "vitest run --coverage",
"lint": "eslint src --ext .ts",
"lint:fix": "eslint src --ext .ts --fix",
"clean": "rimraf dist esm"
},
"dependencies": {},
"devDependencies": {
"embedding-bridge-spec": "workspace:*",
"llm-bridge-spec": "workspace:*",
"@xenova/transformers": "^2.17.2",
"zod": "^4.0.5",
"@types/node": "^20.11.24",
"@typescript-eslint/eslint-plugin": "^7.1.0",
"@typescript-eslint/parser": "^7.1.0",
"@vitest/coverage-v8": "^1.0.0",
"eslint": "^8.57.0",
"rimraf": "^5.0.5",
"typescript": "^5.0.0",
"vitest": "^1.0.0",
"vitest-mock-extended": "^3.1.0"
},
"peerDependencies": {
"embedding-bridge-spec": "workspace:*",
"llm-bridge-spec": "workspace:*",
"@xenova/transformers": "^2.17.2",
"zod": "^4.0.5"
}
}
**New file** (+242)
import type { MultiModalContent } from 'llm-bridge-spec';
import { beforeEach, describe, expect, it, vi } from 'vitest';
import { EmbeddingGemmaBridge } from '../bridge/embeddinggemma-bridge';
import { createEmbeddingGemmaBridge } from '../bridge/embeddinggemma-factory';

const { pipelineSpy } = vi.hoisted(() => ({
pipelineSpy: vi.fn(),
}));

vi.mock('@xenova/transformers', () => ({
pipeline: pipelineSpy,
}));

interface TensorMockOptions {
configHiddenSize?: number | null;
modelHiddenSize?: number;
}

const createTensor = (values: number[], dims?: number[]) => ({
data: Float32Array.from(values),
dims,
});

function setupPipelineMock(returnValue: unknown, options: TensorMockOptions = {}) {
const { configHiddenSize, modelHiddenSize } = options;
const pipelineInstance = vi.fn(async () => returnValue);
const dims =
returnValue &&
typeof returnValue === 'object' &&
'dims' in (returnValue as Record<string, unknown>)
? ((returnValue as { dims?: number[] }).dims ?? undefined)
: undefined;

const inferredDimension = Array.isArray(dims) && dims.length > 0 ? dims.at(-1) : undefined;
const hasConfigOverride = Object.prototype.hasOwnProperty.call(options, 'configHiddenSize');
const configValue = hasConfigOverride ? configHiddenSize : (inferredDimension ?? 2);

const config: Record<string, unknown> = {};
if (typeof configValue === 'number') {
config.hidden_size = configValue;
}

const modelConfig: Record<string, unknown> = {};
const resolvedModelSize =
typeof modelHiddenSize === 'number'
? modelHiddenSize
: typeof configValue === 'number'
? configValue
: (inferredDimension ?? 2);
modelConfig.hidden_size = resolvedModelSize;

Object.assign(pipelineInstance, {
config,
model: { config: modelConfig },
});

pipelineSpy.mockResolvedValue(pipelineInstance);
return pipelineInstance;
}

function expectVectorCloseTo(actual: number[], expected: number[]) {
expect(actual).toHaveLength(expected.length);
expected.forEach((value, index) => {
expect(actual[index]).toBeCloseTo(value, 6);
});
}

function isNumberArray(value: unknown): value is number[] {
return Array.isArray(value) && value.every(item => typeof item === 'number');
}

function isNumberMatrix(value: unknown): value is number[][] {
return Array.isArray(value) && value.every(isNumberArray);
}

function expectEmbeddingVector(actual: number[] | number[][], expected: number[]) {
expect(isNumberArray(actual)).toBe(true);
if (!isNumberArray(actual)) {
throw new Error('Expected embedding vector');
}
expectVectorCloseTo(actual, expected);
}

function expectEmbeddingMatrix(actual: number[] | number[][], expected: number[][]) {
expect(isNumberMatrix(actual)).toBe(true);
if (!isNumberMatrix(actual)) {
throw new Error('Expected embedding matrix');
}
expect(actual).toHaveLength(expected.length);
actual.forEach((row, index) => {
expectVectorCloseTo(row, expected[index]);
});
}

describe('EmbeddingGemmaBridge', () => {
beforeEach(() => {
pipelineSpy.mockReset();
});

it('should create embedding for text input', async () => {
const pipelineInstance = setupPipelineMock(createTensor([0.1, 0.2], [2]));
const bridge = new EmbeddingGemmaBridge();

const res = await bridge.embed({ input: 'hello' });

expectEmbeddingVector(res.embeddings, [0.1, 0.2]);
expect(pipelineSpy).toHaveBeenCalledWith(
'feature-extraction',
'google/embedding-gemma-002',
undefined
);
expect(pipelineInstance).toHaveBeenCalledWith('hello', { pooling: 'mean', normalize: true });
});

it('should support array input and return batched embeddings', async () => {
const pipelineInstance = setupPipelineMock(createTensor([0.1, 0.2, 0.3, 0.4], [2, 2]));
const bridge = new EmbeddingGemmaBridge();

const res = await bridge.embed({ input: ['a', 'b'] });

expectEmbeddingMatrix(res.embeddings, [
[0.1, 0.2],
[0.3, 0.4],
]);
expect(pipelineInstance).toHaveBeenCalledWith(['a', 'b'], { pooling: 'mean', normalize: true });
});

it('should convert multimodal text input to string', async () => {
const pipelineInstance = setupPipelineMock(createTensor([0.2, -0.1], [2]));
const bridge = new EmbeddingGemmaBridge();

await bridge.embed({ input: { contentType: 'text', value: 'multi' } });

expect(pipelineInstance).toHaveBeenCalledWith('multi', { pooling: 'mean', normalize: true });
});

it('should reject non-text multimodal content', async () => {
const bridge = new EmbeddingGemmaBridge();

const imageContent: MultiModalContent = { contentType: 'image', value: Buffer.from('') };

await expect(bridge.embed({ input: imageContent })).rejects.toThrow(
'Only text content is supported for embeddings'
);
expect(pipelineSpy).not.toHaveBeenCalled();
});

it('should reject empty input arrays', async () => {
const bridge = new EmbeddingGemmaBridge();

await expect(bridge.embed({ input: [] })).rejects.toThrow(
'Embedding request must include at least one input item.'
);
expect(pipelineSpy).not.toHaveBeenCalled();
});

it('should apply pipeline and embedding configuration overrides', async () => {
const pipelineInstance = setupPipelineMock(createTensor([0.1, 0.2, 0.3, 0.4], [2, 2]));
const bridge = new EmbeddingGemmaBridge({
pipeline: {
cacheDir: '/tmp/cache',
localFilesOnly: true,
quantized: true,
revision: 'main',
device: 'gpu',
dtype: 'fp16',
executionProviders: ['cpu'],
},
embedding: {
pooling: 'cls',
normalize: false,
batchSize: 8,
},
});

const res = await bridge.embed({ input: ['a', 'b'] });

expectEmbeddingMatrix(res.embeddings, [
[0.1, 0.2],
[0.3, 0.4],
]);
expect(pipelineSpy).toHaveBeenCalledWith(
'feature-extraction',
'google/embedding-gemma-002',
expect.objectContaining({
cache_dir: '/tmp/cache',
local_files_only: true,
quantized: true,
revision: 'main',
device: 'gpu',
dtype: 'fp16',
execution_providers: ['cpu'],
})
);
expect(pipelineInstance).toHaveBeenCalledWith(['a', 'b'], {
pooling: 'cls',
normalize: false,
batch_size: 8,
});
});

it('should derive metadata dimension from model config before embedding', async () => {
const pipelineInstance = setupPipelineMock(createTensor([0.1, 0.2], [2]), {
configHiddenSize: null,
modelHiddenSize: 768,
});
const bridge = new EmbeddingGemmaBridge();

const metadata = await bridge.getMetadata();

expect(metadata).toEqual({ model: 'google/embedding-gemma-002', dimension: 768 });
expect(pipelineInstance).not.toHaveBeenCalled();
});

it('factory should accept empty configuration', async () => {
const pipelineInstance = setupPipelineMock(createTensor([0.3, 0.7], [2]));
const bridge = createEmbeddingGemmaBridge();

const res = await bridge.embed({ input: 'factory' });

expectEmbeddingVector(res.embeddings, [0.3, 0.7]);
expect(pipelineInstance).toHaveBeenCalledWith('factory', { pooling: 'mean', normalize: true });
});

it('should surface errors for malformed batched pipeline output', async () => {
setupPipelineMock([0.1, 0.2]);
const bridge = new EmbeddingGemmaBridge();

await expect(bridge.embed({ input: ['a', 'b'] })).rejects.toThrow(
'Expected batched embeddings but received a single vector from the pipeline.'
);
});

it('should surface errors when tensor output length mismatches batch size', async () => {
setupPipelineMock(createTensor([0.1, 0.2, 0.3], [2, 2]));
const bridge = new EmbeddingGemmaBridge();

await expect(bridge.embed({ input: ['x', 'y'] })).rejects.toThrow(
'Unexpected pipeline output length for the provided input batch.'
);
});
});