Comprehensive spreadsheet creation, editing, and analysis with support for formulas, formatting, data analysis, and visualization. When Claude needs to work with spreadsheets (.xlsx, .xlsm, .csv, .tsv, etc) for: (1) Creating new spreadsheets with formulas and formatting, (2) Reading or analyzing data, (3) Modify existing spreadsheets while preserving formulas, (4) Data analysis and visualization in spreadsheets, or (5) Recalculating formulas
Add this skill
npx mdskills install sickn33/xlsx-officialComprehensive Excel skill with industry-standard formatting rules and formula workflows
1---2name: xlsx3description: "Comprehensive spreadsheet creation, editing, and analysis with support for formulas, formatting, data analysis, and visualization. When Claude needs to work with spreadsheets (.xlsx, .xlsm, .csv, .tsv, etc) for: (1) Creating new spreadsheets with formulas and formatting, (2) Reading or analyzing data, (3) Modify existing spreadsheets while preserving formulas, (4) Data analysis and visualization in spreadsheets, or (5) Recalculating formulas"4license: Proprietary. LICENSE.txt has complete terms5---67# Requirements for Outputs89## All Excel files1011### Zero Formula Errors12- Every Excel model MUST be delivered with ZERO formula errors (#REF!, #DIV/0!, #VALUE!, #N/A, #NAME?)1314### Preserve Existing Templates (when updating templates)15- Study and EXACTLY match existing format, style, and conventions when modifying files16- Never impose standardized formatting on files with established patterns17- Existing template conventions ALWAYS override these guidelines1819## Financial models2021### Color Coding Standards22Unless otherwise stated by the user or existing template2324#### Industry-Standard Color Conventions25- **Blue text (RGB: 0,0,255)**: Hardcoded inputs, and numbers users will change for scenarios26- **Black text (RGB: 0,0,0)**: ALL formulas and calculations27- **Green text (RGB: 0,128,0)**: Links pulling from other worksheets within same workbook28- **Red text (RGB: 255,0,0)**: External links to other files29- **Yellow background (RGB: 255,255,0)**: Key assumptions needing attention or cells that need to be updated3031### Number Formatting Standards3233#### Required Format Rules34- **Years**: Format as text strings (e.g., "2024" not "2,024")35- **Currency**: Use $#,##0 format; ALWAYS specify units in headers ("Revenue ($mm)")36- **Zeros**: Use number formatting to make all zeros "-", including percentages (e.g., "$#,##0;($#,##0);-")37- **Percentages**: Default to 0.0% format (one decimal)38- **Multiples**: Format as 0.0x for valuation multiples (EV/EBITDA, P/E)39- **Negative numbers**: Use parentheses (123) not minus -1234041### Formula Construction Rules4243#### Assumptions Placement44- Place ALL assumptions (growth rates, margins, multiples, etc.) in separate assumption cells45- Use cell references instead of hardcoded values in formulas46- Example: Use =B5*(1+$B$6) instead of =B5*1.054748#### Formula Error Prevention49- Verify all cell references are correct50- Check for off-by-one errors in ranges51- Ensure consistent formulas across all projection periods52- Test with edge cases (zero values, negative numbers)53- Verify no unintended circular references5455#### Documentation Requirements for Hardcodes56- Comment or in cells beside (if end of table). Format: "Source: [System/Document], [Date], [Specific Reference], [URL if applicable]"57- Examples:58 - "Source: Company 10-K, FY2024, Page 45, Revenue Note, [SEC EDGAR URL]"59 - "Source: Company 10-Q, Q2 2025, Exhibit 99.1, [SEC EDGAR URL]"60 - "Source: Bloomberg Terminal, 8/15/2025, AAPL US Equity"61 - "Source: FactSet, 8/20/2025, Consensus Estimates Screen"6263# XLSX creation, editing, and analysis6465## Overview6667A user may ask you to create, edit, or analyze the contents of an .xlsx file. You have different tools and workflows available for different tasks.6869## Important Requirements7071**LibreOffice Required for Formula Recalculation**: You can assume LibreOffice is installed for recalculating formula values using the `recalc.py` script. The script automatically configures LibreOffice on first run7273## Reading and analyzing data7475### Data analysis with pandas76For data analysis, visualization, and basic operations, use **pandas** which provides powerful data manipulation capabilities:7778```python79import pandas as pd8081# Read Excel82df = pd.read_excel('file.xlsx') # Default: first sheet83all_sheets = pd.read_excel('file.xlsx', sheet_name=None) # All sheets as dict8485# Analyze86df.head() # Preview data87df.info() # Column info88df.describe() # Statistics8990# Write Excel91df.to_excel('output.xlsx', index=False)92```9394## Excel File Workflows9596## CRITICAL: Use Formulas, Not Hardcoded Values9798**Always use Excel formulas instead of calculating values in Python and hardcoding them.** This ensures the spreadsheet remains dynamic and updateable.99100### ❌ WRONG - Hardcoding Calculated Values101```python102# Bad: Calculating in Python and hardcoding result103total = df['Sales'].sum()104sheet['B10'] = total # Hardcodes 5000105106# Bad: Computing growth rate in Python107growth = (df.iloc[-1]['Revenue'] - df.iloc[0]['Revenue']) / df.iloc[0]['Revenue']108sheet['C5'] = growth # Hardcodes 0.15109110# Bad: Python calculation for average111avg = sum(values) / len(values)112sheet['D20'] = avg # Hardcodes 42.5113```114115### ✅ CORRECT - Using Excel Formulas116```python117# Good: Let Excel calculate the sum118sheet['B10'] = '=SUM(B2:B9)'119120# Good: Growth rate as Excel formula121sheet['C5'] = '=(C4-C2)/C2'122123# Good: Average using Excel function124sheet['D20'] = '=AVERAGE(D2:D19)'125```126127This applies to ALL calculations - totals, percentages, ratios, differences, etc. The spreadsheet should be able to recalculate when source data changes.128129## Common Workflow1301. **Choose tool**: pandas for data, openpyxl for formulas/formatting1312. **Create/Load**: Create new workbook or load existing file1323. **Modify**: Add/edit data, formulas, and formatting1334. **Save**: Write to file1345. **Recalculate formulas (MANDATORY IF USING FORMULAS)**: Use the recalc.py script135 ```bash136 python recalc.py output.xlsx137 ```1386. **Verify and fix any errors**:139 - The script returns JSON with error details140 - If `status` is `errors_found`, check `error_summary` for specific error types and locations141 - Fix the identified errors and recalculate again142 - Common errors to fix:143 - `#REF!`: Invalid cell references144 - `#DIV/0!`: Division by zero145 - `#VALUE!`: Wrong data type in formula146 - `#NAME?`: Unrecognized formula name147148### Creating new Excel files149150```python151# Using openpyxl for formulas and formatting152from openpyxl import Workbook153from openpyxl.styles import Font, PatternFill, Alignment154155wb = Workbook()156sheet = wb.active157158# Add data159sheet['A1'] = 'Hello'160sheet['B1'] = 'World'161sheet.append(['Row', 'of', 'data'])162163# Add formula164sheet['B2'] = '=SUM(A1:A10)'165166# Formatting167sheet['A1'].font = Font(bold=True, color='FF0000')168sheet['A1'].fill = PatternFill('solid', start_color='FFFF00')169sheet['A1'].alignment = Alignment(horizontal='center')170171# Column width172sheet.column_dimensions['A'].width = 20173174wb.save('output.xlsx')175```176177### Editing existing Excel files178179```python180# Using openpyxl to preserve formulas and formatting181from openpyxl import load_workbook182183# Load existing file184wb = load_workbook('existing.xlsx')185sheet = wb.active # or wb['SheetName'] for specific sheet186187# Working with multiple sheets188for sheet_name in wb.sheetnames:189 sheet = wb[sheet_name]190 print(f"Sheet: {sheet_name}")191192# Modify cells193sheet['A1'] = 'New Value'194sheet.insert_rows(2) # Insert row at position 2195sheet.delete_cols(3) # Delete column 3196197# Add new sheet198new_sheet = wb.create_sheet('NewSheet')199new_sheet['A1'] = 'Data'200201wb.save('modified.xlsx')202```203204## Recalculating formulas205206Excel files created or modified by openpyxl contain formulas as strings but not calculated values. Use the provided `recalc.py` script to recalculate formulas:207208```bash209python recalc.py <excel_file> [timeout_seconds]210```211212Example:213```bash214python recalc.py output.xlsx 30215```216217The script:218- Automatically sets up LibreOffice macro on first run219- Recalculates all formulas in all sheets220- Scans ALL cells for Excel errors (#REF!, #DIV/0!, etc.)221- Returns JSON with detailed error locations and counts222- Works on both Linux and macOS223224## Formula Verification Checklist225226Quick checks to ensure formulas work correctly:227228### Essential Verification229- [ ] **Test 2-3 sample references**: Verify they pull correct values before building full model230- [ ] **Column mapping**: Confirm Excel columns match (e.g., column 64 = BL, not BK)231- [ ] **Row offset**: Remember Excel rows are 1-indexed (DataFrame row 5 = Excel row 6)232233### Common Pitfalls234- [ ] **NaN handling**: Check for null values with `pd.notna()`235- [ ] **Far-right columns**: FY data often in columns 50+236- [ ] **Multiple matches**: Search all occurrences, not just first237- [ ] **Division by zero**: Check denominators before using `/` in formulas (#DIV/0!)238- [ ] **Wrong references**: Verify all cell references point to intended cells (#REF!)239- [ ] **Cross-sheet references**: Use correct format (Sheet1!A1) for linking sheets240241### Formula Testing Strategy242- [ ] **Start small**: Test formulas on 2-3 cells before applying broadly243- [ ] **Verify dependencies**: Check all cells referenced in formulas exist244- [ ] **Test edge cases**: Include zero, negative, and very large values245246### Interpreting recalc.py Output247The script returns JSON with error details:248```json249{250 "status": "success", // or "errors_found"251 "total_errors": 0, // Total error count252 "total_formulas": 42, // Number of formulas in file253 "error_summary": { // Only present if errors found254 "#REF!": {255 "count": 2,256 "locations": ["Sheet1!B5", "Sheet1!C10"]257 }258 }259}260```261262## Best Practices263264### Library Selection265- **pandas**: Best for data analysis, bulk operations, and simple data export266- **openpyxl**: Best for complex formatting, formulas, and Excel-specific features267268### Working with openpyxl269- Cell indices are 1-based (row=1, column=1 refers to cell A1)270- Use `data_only=True` to read calculated values: `load_workbook('file.xlsx', data_only=True)`271- **Warning**: If opened with `data_only=True` and saved, formulas are replaced with values and permanently lost272- For large files: Use `read_only=True` for reading or `write_only=True` for writing273- Formulas are preserved but not evaluated - use recalc.py to update values274275### Working with pandas276- Specify data types to avoid inference issues: `pd.read_excel('file.xlsx', dtype={'id': str})`277- For large files, read specific columns: `pd.read_excel('file.xlsx', usecols=['A', 'C', 'E'])`278- Handle dates properly: `pd.read_excel('file.xlsx', parse_dates=['date_column'])`279280## Code Style Guidelines281**IMPORTANT**: When generating Python code for Excel operations:282- Write minimal, concise Python code without unnecessary comments283- Avoid verbose variable names and redundant operations284- Avoid unnecessary print statements285286**For Excel files themselves**:287- Add comments to cells with complex formulas or important assumptions288- Document data sources for hardcoded values289- Include notes for key calculations and model sections
Full transparency — inspect the skill content before installing.