Select and optimize embedding models for semantic search and RAG applications. Use when choosing embedding models, implementing chunking strategies, or optimizing embedding quality for specific domains.
Add this skill:

```
npx mdskills install sickn33/embedding-strategies
```

Comprehensive embedding guide with practical templates for multiple models and chunking strategies.
---
name: embedding-strategies
description: Select and optimize embedding models for semantic search and RAG applications. Use when choosing embedding models, implementing chunking strategies, or optimizing embedding quality for specific domains.
---

# Embedding Strategies

Guide to selecting and optimizing embedding models for vector search applications.

## Do not use this skill when

- The task is unrelated to embedding strategies
- You need a different domain or tool outside this scope

## Instructions

- Clarify goals, constraints, and required inputs.
- Apply relevant best practices and validate outcomes.
- Provide actionable steps and verification.
- If detailed examples are required, open `resources/implementation-playbook.md`.

## Use this skill when

- Choosing embedding models for RAG
- Optimizing chunking strategies
- Fine-tuning embeddings for domains
- Comparing embedding model performance
- Reducing embedding dimensions
- Handling multilingual content

## Core Concepts

### 1. Embedding Model Comparison

| Model | Dimensions | Max Tokens | Best For |
|-------|------------|------------|----------|
| **text-embedding-3-large** | 3072 | 8191 | High accuracy |
| **text-embedding-3-small** | 1536 | 8191 | Cost-effective |
| **voyage-2** | 1024 | 4000 | Code, legal |
| **bge-large-en-v1.5** | 1024 | 512 | Open source |
| **all-MiniLM-L6-v2** | 384 | 256 | Fast, lightweight |
| **multilingual-e5-large** | 1024 | 512 | Multi-language |

### 2. Embedding Pipeline

```
Document → Chunking → Preprocessing → Embedding Model → Vector
               ↓              ↓               ↓
        [Overlap, Size] [Clean, Normalize] [API/Local]
```

## Templates

### Template 1: OpenAI Embeddings

```python
from openai import OpenAI
from typing import List, Optional

client = OpenAI()


def get_embeddings(
    texts: List[str],
    model: str = "text-embedding-3-small",
    dimensions: Optional[int] = None
) -> List[List[float]]:
    """Get embeddings from OpenAI, batching large lists."""
    batch_size = 100
    all_embeddings = []

    for i in range(0, len(texts), batch_size):
        batch = texts[i:i + batch_size]

        kwargs = {"input": batch, "model": model}
        if dimensions:
            kwargs["dimensions"] = dimensions

        response = client.embeddings.create(**kwargs)
        all_embeddings.extend(item.embedding for item in response.data)

    return all_embeddings


def get_embedding(text: str, **kwargs) -> List[float]:
    """Get a single embedding."""
    return get_embeddings([text], **kwargs)[0]


def get_reduced_embedding(text: str, dimensions: int = 512) -> List[float]:
    """Get an embedding with reduced dimensions (Matryoshka truncation)."""
    return get_embedding(
        text,
        model="text-embedding-3-small",
        dimensions=dimensions
    )
```

### Template 2: Local Embeddings with Sentence Transformers

```python
from sentence_transformers import SentenceTransformer
from typing import List
import numpy as np


class LocalEmbedder:
    """Local embedding with sentence-transformers."""

    def __init__(
        self,
        model_name: str = "BAAI/bge-large-en-v1.5",
        device: str = "cuda"
    ):
        self.model_name = model_name
        self.model = SentenceTransformer(model_name, device=device)

    def embed(
        self,
        texts: List[str],
        normalize: bool = True,
        show_progress: bool = False
    ) -> np.ndarray:
        """Embed texts with optional normalization."""
        return self.model.encode(
            texts,
            normalize_embeddings=normalize,
            show_progress_bar=show_progress,
            convert_to_numpy=True
        )

    def embed_query(self, query: str) -> np.ndarray:
        """Embed a query with a BGE-style prefix."""
        # BGE models benefit from a retrieval instruction on the query side
        if "bge" in self.model_name.lower():
            query = f"Represent this sentence for searching relevant passages: {query}"
        return self.embed([query])[0]

    def embed_documents(self, documents: List[str]) -> np.ndarray:
        """Embed documents for indexing."""
        return self.embed(documents)


# E5 models expect "query:" / "passage:" instruction prefixes
class E5Embedder:
    def __init__(self, model_name: str = "intfloat/multilingual-e5-large"):
        self.model = SentenceTransformer(model_name)

    def embed_query(self, query: str) -> np.ndarray:
        return self.model.encode(f"query: {query}")

    def embed_document(self, document: str) -> np.ndarray:
        return self.model.encode(f"passage: {document}")
```

### Template 3: Chunking Strategies

```python
from typing import List, Tuple
import re


def chunk_by_tokens(
    text: str,
    chunk_size: int = 512,
    chunk_overlap: int = 50,
    tokenizer=None
) -> List[str]:
    """Chunk text by token count with overlap."""
    import tiktoken
    tokenizer = tokenizer or tiktoken.get_encoding("cl100k_base")

    tokens = tokenizer.encode(text)
    chunks = []

    start = 0
    while start < len(tokens):
        end = start + chunk_size
        chunks.append(tokenizer.decode(tokens[start:end]))
        start = end - chunk_overlap

    return chunks


def chunk_by_sentences(
    text: str,
    max_chunk_size: int = 1000,
    min_chunk_size: int = 100
) -> List[str]:
    """Chunk text by sentences, respecting size limits."""
    import nltk
    sentences = nltk.sent_tokenize(text)

    chunks = []
    current_chunk = []
    current_size = 0

    for sentence in sentences:
        sentence_size = len(sentence)

        if current_size + sentence_size > max_chunk_size and current_chunk:
            chunks.append(" ".join(current_chunk))
            current_chunk = []
            current_size = 0

        current_chunk.append(sentence)
        current_size += sentence_size

    if current_chunk:
        tail = " ".join(current_chunk)
        # Merge an undersized final chunk into the previous one
        if chunks and len(tail) < min_chunk_size:
            chunks[-1] = f"{chunks[-1]} {tail}"
        else:
            chunks.append(tail)

    return chunks


def chunk_by_semantic_sections(
    text: str,
    headers_pattern: str = r'^#{1,3}\s+.+$'
) -> List[Tuple[str, str]]:
    """Chunk markdown by headers, preserving hierarchy."""
    lines = text.split('\n')
    chunks = []
    current_header = ""
    current_content = []

    for line in lines:
        if re.match(headers_pattern, line):
            if current_content:
                chunks.append((current_header, '\n'.join(current_content)))
            current_header = line
            current_content = []
        else:
            current_content.append(line)

    if current_content:
        chunks.append((current_header, '\n'.join(current_content)))

    return chunks


def recursive_character_splitter(
    text: str,
    chunk_size: int = 1000,
    chunk_overlap: int = 200,
    separators: List[str] = None
) -> List[str]:
    """LangChain-style recursive splitter."""
    separators = separators or ["\n\n", "\n", ". ", " ", ""]

    def split_text(text: str, separators: List[str]) -> List[str]:
        if not text:
            return []

        separator = separators[0]
        remaining_separators = separators[1:]

        if separator == "":
            # Character-level split
            return [
                text[i:i + chunk_size]
                for i in range(0, len(text), chunk_size - chunk_overlap)
            ]

        splits = text.split(separator)
        chunks = []
        current_chunk = []
        current_length = 0

        for split in splits:
            split_length = len(split) + len(separator)

            if current_length + split_length > chunk_size and current_chunk:
                chunk_text = separator.join(current_chunk)

                # Recursively split if still too large
                if len(chunk_text) > chunk_size and remaining_separators:
                    chunks.extend(split_text(chunk_text, remaining_separators))
                else:
                    chunks.append(chunk_text)

                # Start new chunk with overlap
                overlap_splits = []
                overlap_length = 0
                for s in reversed(current_chunk):
                    if overlap_length + len(s) <= chunk_overlap:
                        overlap_splits.insert(0, s)
                        overlap_length += len(s)
                    else:
                        break
                current_chunk = overlap_splits
                current_length = overlap_length

            current_chunk.append(split)
            current_length += split_length

        if current_chunk:
            chunks.append(separator.join(current_chunk))

        return chunks

    return split_text(text, separators)
```

### Template 4: Domain-Specific Embedding Pipeline

```python
import re
from typing import List, Optional


class DomainEmbeddingPipeline:
    """Pipeline for domain-specific embeddings."""

    def __init__(
        self,
        embedding_model: str = "text-embedding-3-small",
        chunk_size: int = 512,
        chunk_overlap: int = 50,
        preprocessing_fn=None
    ):
        self.embedding_model = embedding_model
        self.chunk_size = chunk_size
        self.chunk_overlap = chunk_overlap
        self.preprocess = preprocessing_fn or self._default_preprocess

    def _default_preprocess(self, text: str) -> str:
        """Default preprocessing."""
        # Collapse excessive whitespace
        text = re.sub(r'\s+', ' ', text)
        # Remove special characters
        text = re.sub(r'[^\w\s.,!?-]', '', text)
        return text.strip()

    def process_documents(
        self,
        documents: List[dict],
        id_field: str = "id",
        content_field: str = "content",
        metadata_fields: Optional[List[str]] = None
    ) -> List[dict]:
        """Process documents for vector storage.

        Uses chunk_by_tokens (Template 3) and get_embeddings (Template 1).
        """
        processed = []

        for doc in documents:
            content = doc[content_field]
            doc_id = doc[id_field]

            # Preprocess
            cleaned = self.preprocess(content)

            # Chunk
            chunks = chunk_by_tokens(
                cleaned,
                self.chunk_size,
                self.chunk_overlap
            )

            # Create embeddings
            embeddings = get_embeddings(chunks, self.embedding_model)

            # Create records
            for i, (chunk, embedding) in enumerate(zip(chunks, embeddings)):
                record = {
                    "id": f"{doc_id}_chunk_{i}",
                    "document_id": doc_id,
                    "chunk_index": i,
                    "text": chunk,
                    "embedding": embedding
                }

                # Add metadata
                if metadata_fields:
                    for field in metadata_fields:
                        if field in doc:
                            record[field] = doc[field]

                processed.append(record)

        return processed


# Code-specific pipeline
class CodeEmbeddingPipeline:
    """Specialized pipeline for code embeddings."""

    def __init__(self, model: str = "voyage-code-2"):
        self.model = model

    def chunk_code(self, code: str, language: str) -> List[dict]:
        """Chunk code by functions/classes (sketch).

        Parse with tree-sitter, extract functions, classes, and methods,
        and return chunks with surrounding context.
        """
        raise NotImplementedError

    def embed_with_context(self, chunk: str, context: str) -> List[float]:
        """Embed code with surrounding context."""
        combined = f"Context: {context}\n\nCode:\n{chunk}"
        return get_embedding(combined, model=self.model)
```

### Template 5: Embedding Quality Evaluation

```python
import numpy as np
from typing import List


def evaluate_retrieval_quality(
    queries: List[str],
    relevant_docs: List[List[str]],   # relevant doc IDs per query
    retrieved_docs: List[List[str]],  # retrieved doc IDs per query
    k: int = 10
) -> dict:
    """Evaluate embedding quality for retrieval."""

    def precision_at_k(relevant: set, retrieved: List[str], k: int) -> float:
        retrieved_k = retrieved[:k]
        relevant_retrieved = len(set(retrieved_k) & relevant)
        return relevant_retrieved / k

    def recall_at_k(relevant: set, retrieved: List[str], k: int) -> float:
        retrieved_k = retrieved[:k]
        relevant_retrieved = len(set(retrieved_k) & relevant)
        return relevant_retrieved / len(relevant) if relevant else 0.0

    def mrr(relevant: set, retrieved: List[str]) -> float:
        for i, doc in enumerate(retrieved):
            if doc in relevant:
                return 1 / (i + 1)
        return 0.0

    def ndcg_at_k(relevant: set, retrieved: List[str], k: int) -> float:
        dcg = sum(
            1 / np.log2(i + 2) if doc in relevant else 0
            for i, doc in enumerate(retrieved[:k])
        )
        ideal_dcg = sum(1 / np.log2(i + 2) for i in range(min(len(relevant), k)))
        return dcg / ideal_dcg if ideal_dcg > 0 else 0.0

    metrics = {
        f"precision@{k}": [],
        f"recall@{k}": [],
        "mrr": [],
        f"ndcg@{k}": []
    }

    for relevant, retrieved in zip(relevant_docs, retrieved_docs):
        relevant_set = set(relevant)
        metrics[f"precision@{k}"].append(precision_at_k(relevant_set, retrieved, k))
        metrics[f"recall@{k}"].append(recall_at_k(relevant_set, retrieved, k))
        metrics["mrr"].append(mrr(relevant_set, retrieved))
        metrics[f"ndcg@{k}"].append(ndcg_at_k(relevant_set, retrieved, k))

    return {name: float(np.mean(values)) for name, values in metrics.items()}


def compute_embedding_similarity(
    embeddings1: np.ndarray,
    embeddings2: np.ndarray,
    metric: str = "cosine"
) -> np.ndarray:
    """Compute a similarity matrix between two embedding sets."""
    if metric == "cosine":
        # Normalize, then a dot product equals cosine similarity
        norm1 = embeddings1 / np.linalg.norm(embeddings1, axis=1, keepdims=True)
        norm2 = embeddings2 / np.linalg.norm(embeddings2, axis=1, keepdims=True)
        return norm1 @ norm2.T
    elif metric == "euclidean":
        from scipy.spatial.distance import cdist
        # Negate distance so that larger means more similar
        return -cdist(embeddings1, embeddings2, metric='euclidean')
    elif metric == "dot":
        return embeddings1 @ embeddings2.T
    else:
        raise ValueError(f"Unknown metric: {metric}")
```

## Best Practices

### Do's

- **Match model to use case** - Code vs prose vs multilingual
- **Chunk thoughtfully** - Preserve semantic boundaries
- **Normalize embeddings** - For cosine similarity
- **Batch requests** - More efficient than one-by-one
- **Cache embeddings** - Avoid recomputing

### Don'ts

- **Don't ignore token limits** - Truncation loses info
- **Don't mix embedding models** - Incompatible spaces
- **Don't skip preprocessing** - Garbage in, garbage out
- **Don't over-chunk** - Lose context

## Resources

- [OpenAI Embeddings](https://platform.openai.com/docs/guides/embeddings)
- [Sentence Transformers](https://www.sbert.net/)
- [MTEB Benchmark](https://huggingface.co/spaces/mteb/leaderboard)
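## Worked Example: Retrieval Metrics by Hand

The retrieval metrics in Template 5 are easy to sanity-check on a tiny example. This stdlib-only sketch (the document IDs are made up for illustration) recomputes precision@k, recall@k, and MRR for a single query without numpy:

```python
def precision_at_k(relevant: set, retrieved: list, k: int) -> float:
    # Fraction of the top-k retrieved IDs that are relevant
    return len(set(retrieved[:k]) & relevant) / k


def recall_at_k(relevant: set, retrieved: list, k: int) -> float:
    # Fraction of all relevant IDs that appear in the top-k
    return len(set(retrieved[:k]) & relevant) / len(relevant) if relevant else 0.0


def mrr(relevant: set, retrieved: list) -> float:
    # Reciprocal rank of the first relevant hit
    for i, doc in enumerate(retrieved):
        if doc in relevant:
            return 1 / (i + 1)
    return 0.0


relevant = {"d1", "d4"}
retrieved = ["d3", "d1", "d9", "d4", "d7"]

print(precision_at_k(relevant, retrieved, 5))  # 2 of 5 relevant → 0.4
print(recall_at_k(relevant, retrieved, 5))     # both relevant found → 1.0
print(mrr(relevant, retrieved))                # first hit at rank 2 → 0.5
```

Hand-checking a few queries like this before running a full evaluation catches off-by-one errors in ranking code early.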
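## Worked Example: Why Normalize Embeddings

The "normalize embeddings" best practice can also be demonstrated in plain Python: after L2-normalization, a bare dot product returns the same score as full cosine similarity, which is why vector stores can use cheap dot-product search over normalized vectors. The 2-D vectors here are toy values, not real embeddings:

```python
import math


def normalize(v: list) -> list:
    # Scale to unit length (L2 norm = 1)
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]


def dot(a: list, b: list) -> float:
    return sum(x * y for x, y in zip(a, b))


def cosine(a: list, b: list) -> float:
    # Full cosine similarity: dot product over the product of norms
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


a, b = [3.0, 4.0], [4.0, 3.0]

# Dot product of normalized vectors equals cosine of the originals
print(round(cosine(a, b), 6))                     # 0.96
print(round(dot(normalize(a), normalize(b)), 6))  # 0.96
```

This is the same identity the `cosine` branch of `compute_embedding_similarity` in Template 5 relies on.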
Full transparency — inspect the skill content before installing.