This is page 1 of 2. Use http://codebase.md/gongrzhe/office-word-mcp-server?page={x} to view the full context.
# Directory Structure
```
├── __init__.py
├── .gitignore
├── Dockerfile
├── LICENSE
├── mcp-config.json
├── office_word_mcp_server
│   └── __init__.py
├── pyproject.toml
├── README.md
├── RENDER_DEPLOYMENT.md
├── requirements.txt
├── setup_mcp.py
├── smithery.yaml
├── test_formatting.py
├── tests
│   └── test_convert_to_pdf.py
├── uv.lock
├── word_document_server
│   ├── __init__.py
│   ├── core
│   │   ├── __init__.py
│   │   ├── comments.py
│   │   ├── footnotes.py
│   │   ├── protection.py
│   │   ├── styles.py
│   │   ├── tables.py
│   │   └── unprotect.py
│   ├── main.py
│   ├── tools
│   │   ├── __init__.py
│   │   ├── comment_tools.py
│   │   ├── content_tools.py
│   │   ├── document_tools.py
│   │   ├── extended_document_tools.py
│   │   ├── footnote_tools.py
│   │   ├── format_tools.py
│   │   └── protection_tools.py
│   └── utils
│       ├── __init__.py
│       ├── document_utils.py
│       ├── extended_document_utils.py
│       └── file_utils.py
└── word_mcp_server.py
```
# Files
--------------------------------------------------------------------------------
/.gitignore:
--------------------------------------------------------------------------------
```
# Project files
.idea
.DS_Store
# Python-generated files
__pycache__/
*.py[oc]
build/
dist/
wheels/
*.egg-info
# Virtual environments
.venv
.env.example
```
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
```markdown
# Office-Word-MCP-Server
A Model Context Protocol (MCP) server for creating, reading, and manipulating Microsoft Word documents. This server enables AI assistants to work with Word documents through a standardized interface, providing rich document editing capabilities.
<a href="https://glama.ai/mcp/servers/@GongRzhe/Office-Word-MCP-Server">
<img width="380" height="200" src="https://glama.ai/mcp/servers/@GongRzhe/Office-Word-MCP-Server/badge" alt="Office Word Server MCP server" />
</a>

## Overview
Office-Word-MCP-Server implements the [Model Context Protocol](https://modelcontextprotocol.io/) to expose Word document operations as tools and resources. It serves as a bridge between AI assistants and Microsoft Word documents, allowing for document creation, content addition, formatting, and analysis.
The server features a modular architecture that separates concerns into core functionality, tools, and utilities, making it highly maintainable and extensible for future enhancements.
### Example
#### Prompt

#### Output

## Features
### Document Management
- Create new Word documents with metadata
- Extract text and analyze document structure
- View document properties and statistics
- List available documents in a directory
- Create copies of existing documents
- Merge multiple documents into a single document
- Convert Word documents to PDF format
### Content Creation
- Add headings with different levels and direct formatting (font, size, bold, italic, borders)
- Insert paragraphs with optional styling and direct formatting (font, size, bold, italic, color)
- Create tables with custom data
- Add images with proportional scaling
- Insert page breaks
- Insert bulleted and numbered lists with proper XML formatting
- Add footnotes and endnotes to documents
- Convert footnotes to endnotes
- Customize footnote and endnote styling
- Create professional table layouts for technical documentation
- Design callout boxes and formatted content for instructional materials
- Build structured data tables for business reports with consistent styling
- Insert content relative to existing text or paragraph indices
### Rich Text Formatting
- Format specific text sections (bold, italic, underline)
- Change text color and font properties
- Apply custom styles to text elements
- Search and replace text throughout documents
- Individual cell text formatting within tables
- Multiple formatting combinations for enhanced visual appeal
- Font customization with family and size control
- Direct formatting during content creation (paragraphs and headings)
- Reduce function calls by combining content creation with formatting
- Add section header borders for visual separation
### Table Formatting
- Format tables with borders and styles
- Create header rows with distinct formatting
- Apply cell shading and custom borders
- Structure tables for better readability
- Individual cell background shading with color support
- Alternating row colors for improved readability
- Enhanced header row highlighting with custom colors
- Cell text formatting with bold, italic, underline, color, font size, and font family
- Comprehensive color support with named colors and hex color codes
- Cell padding management with independent control of all sides
- Cell alignment (horizontal and vertical positioning)
- Cell merging (horizontal, vertical, and rectangular areas)
- Column width management with multiple units (points, percentage, auto-fit)
- Auto-fit capabilities for dynamic column sizing
- Professional callout table support with icon cells and styled content
### Advanced Document Manipulation
- Delete paragraphs
- Insert content relative to specific text or paragraph indices
- Insert bulleted and numbered lists with proper XML numbering structure
- Insert headers and paragraphs before or after target locations
- Create custom document styles
- Apply consistent formatting throughout documents
- Format specific ranges of text with detailed control
- Flexible padding units with support for points and percentage-based measurements
- Clear, readable table presentation with proper alignment and spacing
### Document Protection
- Add password protection to documents
- Implement restricted editing with editable sections
- Add digital signatures to documents
- Verify document authenticity and integrity
### Comment Extraction
- Extract all comments from a document
- Filter comments by author
- Get comments for specific paragraphs
- Access comment metadata (author, date, text)
## Installation
### Installing via Smithery
To install Office Word Document Server for Claude Desktop automatically via [Smithery](https://smithery.ai/server/@GongRzhe/Office-Word-MCP-Server):
```bash
npx -y @smithery/cli install @GongRzhe/Office-Word-MCP-Server --client claude
```
### Prerequisites
- Python 3.11 or higher (`pyproject.toml` declares `requires-python = ">=3.11"`)
- pip package manager
### Basic Installation
```bash
# Clone the repository
git clone https://github.com/GongRzhe/Office-Word-MCP-Server.git
cd Office-Word-MCP-Server
# Install dependencies
pip install -r requirements.txt
```
### Using the Setup Script
Alternatively, you can use the provided setup script which handles:
- Checking prerequisites
- Setting up a virtual environment
- Installing dependencies
- Generating MCP configuration
```bash
python setup_mcp.py
```
## Usage with Claude for Desktop
### Configuration
#### Method 1: After Local Installation
1. After installation, add the server to your Claude for Desktop configuration file:
```json
{
  "mcpServers": {
    "word-document-server": {
      "command": "python",
      "args": ["/path/to/word_mcp_server.py"]
    }
  }
}
```
#### Method 2: Without Installation (Using uvx)
1. You can also configure Claude for Desktop to run the server without a local installation by launching it through `uvx`:
```json
{
  "mcpServers": {
    "word-document-server": {
      "command": "uvx",
      "args": ["--from", "office-word-mcp-server", "word_mcp_server"]
    }
  }
}
```
2. Configuration file locations:
   - macOS: `~/Library/Application Support/Claude/claude_desktop_config.json`
   - Windows: `%APPDATA%\Claude\claude_desktop_config.json`
3. Restart Claude for Desktop to load the configuration.
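The manual edit described above can also be scripted. The following is a minimal sketch and not part of this project: `add_server_entry` and `update_config_file` are hypothetical helpers that merge a `word-document-server` entry into an existing configuration without discarding other servers, and the script path passed in is a placeholder.

```python
import json
from pathlib import Path


def add_server_entry(config: dict, script_path: str) -> dict:
    """Return a copy of `config` with the Word server registered.

    Hypothetical helper for illustration; existing servers are preserved.
    """
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))
    servers["word-document-server"] = {
        "command": "python",
        "args": [script_path],
    }
    merged["mcpServers"] = servers
    return merged


def update_config_file(config_file: Path, script_path: str) -> None:
    """Read, merge, and rewrite a Claude for Desktop config file."""
    config = json.loads(config_file.read_text()) if config_file.exists() else {}
    merged = add_server_entry(config, script_path)
    config_file.write_text(json.dumps(merged, indent=2))
```

Call `update_config_file` with the configuration path for your platform from the list above, then restart Claude for Desktop.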
### Example Operations
Once configured, you can ask Claude to perform operations like:
- "Create a new document called 'report.docx' with a title page"
- "Add a heading and three paragraphs to my document"
- "Add my name in Helvetica 36pt bold at the top of the document"
- "Add a section heading 'Summary' in Helvetica 14pt bold with a bottom border"
- "Add a paragraph in Times New Roman 14pt with italic blue text"
- "Insert a bulleted list after the paragraph containing 'Introduction'"
- "Insert a numbered list with items: 'First step', 'Second step', 'Third step'"
- "Add bullet points after the 'Summary' heading"
- "Insert a 4x4 table with sales data"
- "Format the word 'important' in paragraph 2 to be bold and red"
- "Search and replace all instances of 'old term' with 'new term'"
- "Create a custom style for section headings"
- "Apply formatting to the table in my document"
- "Extract all comments from my document"
- "Show me all comments by John Doe"
- "Get comments for paragraph 3"
- "Make the text in table cell (1,2) bold and blue with 14pt font"
- "Add 10 points of padding to all sides of the header cells"
- "Create a callout table with a blue checkmark icon and white text"
- "Set the first column width to 50 points and auto-fit the remaining columns"
- "Apply alternating row colors to make the table more readable"
## API Reference
### Document Creation and Properties
```python
create_document(filename, title=None, author=None)
get_document_info(filename)
get_document_text(filename)
get_document_outline(filename)
list_available_documents(directory=".")
copy_document(source_filename, destination_filename=None)
convert_to_pdf(filename, output_filename=None)
```
### Content Addition
```python
add_heading(filename, text, level=1, font_name=None, font_size=None,
bold=None, italic=None, border_bottom=False)
add_paragraph(filename, text, style=None, font_name=None, font_size=None,
bold=None, italic=None, color=None)
add_table(filename, rows, cols, data=None)
add_picture(filename, image_path, width=None)
add_page_break(filename)
```
### Advanced Content Manipulation
```python
# Insert content relative to existing text or paragraph index
insert_header_near_text(filename, target_text=None, header_title=None,
position='after', header_style='Heading 1',
target_paragraph_index=None)
insert_line_or_paragraph_near_text(filename, target_text=None, line_text=None,
position='after', line_style=None,
target_paragraph_index=None)
# Insert bulleted or numbered lists with proper XML formatting
insert_numbered_list_near_text(filename, target_text=None, list_items=None,
position='after', target_paragraph_index=None,
bullet_type='bullet')
# bullet_type options:
# 'bullet' - Creates bulleted list with bullets (•)
# 'number' - Creates numbered list (1, 2, 3, ...)
```
### Content Extraction
```python
get_document_text(filename)
get_paragraph_text_from_document(filename, paragraph_index)
find_text_in_document(filename, text_to_find, match_case=True, whole_word=False)
```
### Text Formatting
```python
format_text(filename, paragraph_index, start_pos, end_pos, bold=None,
italic=None, underline=None, color=None, font_size=None, font_name=None)
search_and_replace(filename, find_text, replace_text)
delete_paragraph(filename, paragraph_index)
create_custom_style(filename, style_name, bold=None, italic=None,
font_size=None, font_name=None, color=None, base_style=None)
```
### Table Formatting
```python
format_table(filename, table_index, has_header_row=None,
border_style=None, shading=None)
set_table_cell_shading(filename, table_index, row_index, col_index,
fill_color, pattern="clear")
apply_table_alternating_rows(filename, table_index,
color1="FFFFFF", color2="F2F2F2")
highlight_table_header(filename, table_index,
header_color="4472C4", text_color="FFFFFF")
# Cell merging tools
merge_table_cells(filename, table_index, start_row, start_col, end_row, end_col)
merge_table_cells_horizontal(filename, table_index, row_index, start_col, end_col)
merge_table_cells_vertical(filename, table_index, col_index, start_row, end_row)
# Cell alignment tools
set_table_cell_alignment(filename, table_index, row_index, col_index,
horizontal="left", vertical="top")
set_table_alignment_all(filename, table_index,
horizontal="left", vertical="top")
# Cell text formatting tools
format_table_cell_text(filename, table_index, row_index, col_index,
text_content=None, bold=None, italic=None, underline=None,
color=None, font_size=None, font_name=None)
# Cell padding tools
set_table_cell_padding(filename, table_index, row_index, col_index,
top=None, bottom=None, left=None, right=None, unit="points")
# Column width management
set_table_column_width(filename, table_index, col_index, width, width_type="points")
set_table_column_widths(filename, table_index, widths, width_type="points")
set_table_width(filename, table_index, width, width_type="points")
auto_fit_table_columns(filename, table_index)
```
### Comment Extraction
```python
get_all_comments(filename)
get_comments_by_author(filename, author)
get_comments_for_paragraph(filename, paragraph_index)
```
## Troubleshooting
### Common Issues
1. **Missing Styles**
   - Some documents may lack required styles for heading and table operations
   - The server will attempt to create missing styles or use direct formatting
   - For best results, use templates with standard Word styles
2. **Permission Issues**
   - Ensure the server has permission to read/write to the document paths
   - Use the `copy_document` function to create editable copies of locked documents
   - Check file ownership and permissions if operations fail
3. **Image Insertion Problems**
   - Use absolute paths for image files
   - Verify image format compatibility (JPEG, PNG recommended)
   - Check image file size and permissions
4. **Table Formatting Issues**
   - **Cell index errors**: Ensure row and column indices are within table bounds (0-based indexing)
   - **Color format problems**: Use hex colors without '#' prefix (e.g., "FF0000" for red) or standard color names
   - **Padding unit confusion**: Specify "points" or "percent" explicitly when setting cell padding
   - **Column width conflicts**: Auto-fit may override manual column width settings
   - **Text formatting persistence**: Apply cell text formatting after setting cell content for best results
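Two of the table issues above (index bounds and hex color format) can be checked before a tool is called. The helpers below are a hypothetical sketch and not part of the server's API: `normalize_hex_color` strips one optional `#` and upper-cases a six-digit hex value (named colors are not handled), and `check_cell_index` validates 0-based indices against table dimensions supplied by the caller.

```python
import re

_HEX_COLOR = re.compile(r"[0-9A-Fa-f]{6}")


def normalize_hex_color(value: str) -> str:
    """Return a six-digit upper-case hex color without a leading '#'."""
    candidate = value.strip()
    if candidate.startswith("#"):
        candidate = candidate[1:]
    if not _HEX_COLOR.fullmatch(candidate):
        raise ValueError(f"Not a six-digit hex color: {value!r}")
    return candidate.upper()


def check_cell_index(row_index: int, col_index: int, rows: int, cols: int) -> None:
    """Raise IndexError unless (row_index, col_index) is a valid 0-based cell."""
    if not (0 <= row_index < rows and 0 <= col_index < cols):
        raise IndexError(
            f"Cell ({row_index}, {col_index}) is outside a {rows}x{cols} table"
        )
```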
### Debugging
Enable detailed logging by setting the environment variable:
```bash
export MCP_DEBUG=1 # Linux/macOS
set MCP_DEBUG=1 # Windows
```
## Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
1. Fork the repository
2. Create your feature branch (`git checkout -b feature/amazing-feature`)
3. Commit your changes (`git commit -m 'Add some amazing feature'`)
4. Push to the branch (`git push origin feature/amazing-feature`)
5. Open a Pull Request
## License
This project is licensed under the MIT License - see the LICENSE file for details.
## Acknowledgments
- [Model Context Protocol](https://modelcontextprotocol.io/) for the protocol specification
- [python-docx](https://python-docx.readthedocs.io/) for Word document manipulation
- [FastMCP](https://github.com/modelcontextprotocol/python-sdk) for the Python MCP implementation
---
_Note: This server interacts with document files on your system. Always verify that requested operations are appropriate before confirming them in Claude for Desktop or other MCP clients._
```
--------------------------------------------------------------------------------
/requirements.txt:
--------------------------------------------------------------------------------
```
fastmcp
python-docx
msoffcrypto-tool
docx2pdf
python-dotenv
```
--------------------------------------------------------------------------------
/office_word_mcp_server/__init__.py:
--------------------------------------------------------------------------------
```python
from word_document_server.main import run_server
__all__ = ["run_server"]
```
--------------------------------------------------------------------------------
/__init__.py:
--------------------------------------------------------------------------------
```python
"""Office Word MCP Server package entry point."""
from word_document_server.main import run_server
__all__ = ["run_server"]
```
--------------------------------------------------------------------------------
/word_mcp_server.py:
--------------------------------------------------------------------------------
```python
#!/usr/bin/env python3
"""
Run script for the Word Document Server.
This script provides a simple way to start the Word Document Server.
"""
from word_document_server.main import run_server
if __name__ == "__main__":
    run_server()
```
--------------------------------------------------------------------------------
/mcp-config.json:
--------------------------------------------------------------------------------
```json
{
  "mcpServers": {
    "word-document-server": {
      "command": "/Users/gongzhe/GitRepos/Office-Word-MCP-Server/.venv/bin/python",
      "args": [
        "/Users/gongzhe/GitRepos/Office-Word-MCP-Server/word_mcp_server.py"
      ],
      "env": {
        "PYTHONPATH": "/Users/gongzhe/GitRepos/Office-Word-MCP-Server",
        "MCP_TRANSPORT": "stdio"
      }
    }
  }
}
```
--------------------------------------------------------------------------------
/word_document_server/utils/__init__.py:
--------------------------------------------------------------------------------
```python
"""
Utility functions for the Word Document Server.
This package contains utility modules for file operations and document handling.
"""
from word_document_server.utils.file_utils import check_file_writeable, create_document_copy, ensure_docx_extension
from word_document_server.utils.document_utils import get_document_properties, extract_document_text, get_document_structure, find_paragraph_by_text, find_and_replace_text
```
--------------------------------------------------------------------------------
/smithery.yaml:
--------------------------------------------------------------------------------
```yaml
# Smithery configuration file: https://smithery.ai/docs/build/project-config
startCommand:
  type: stdio
  configSchema:
    # JSON Schema defining the configuration options for the MCP.
    type: object
    description: No configuration options required
  commandFunction:
    # A JS function that produces the CLI command based on the given config to start the MCP on stdio.
    |-
    (config) => ({command:'word_mcp_server', args:[]})
  exampleConfig: {}
```
--------------------------------------------------------------------------------
/word_document_server/__init__.py:
--------------------------------------------------------------------------------
```python
"""
Word Document Server - MCP server for Microsoft Word document manipulation.
This package provides tools for creating, reading, and manipulating Microsoft Word
documents through the Model Context Protocol (MCP).
Features:
- Document creation and management
- Content addition (headings, paragraphs, tables, images)
- Text and table formatting
- Document protection (password, restricted editing, signatures)
- Footnote and endnote management
"""
__version__ = "1.0.0"
```
--------------------------------------------------------------------------------
/Dockerfile:
--------------------------------------------------------------------------------
```dockerfile
# Generated by https://smithery.ai. See: https://smithery.ai/docs/build/project-config
# syntax=docker/dockerfile:1
# Use official Python runtime
FROM python:3.11-slim
# Set working directory
WORKDIR /app
# Install build dependencies
RUN apt-get update \
    && apt-get install -y --no-install-recommends build-essential \
    && rm -rf /var/lib/apt/lists/*
# Copy project files
COPY . /app
# Install Python dependencies
RUN pip install --no-cache-dir .
# Default command
ENTRYPOINT ["word_mcp_server"]
```
--------------------------------------------------------------------------------
/word_document_server/core/__init__.py:
--------------------------------------------------------------------------------
```python
"""
Core functionality for the Word Document Server.
This package contains the core functionality modules used by the Word Document Server.
"""
from word_document_server.core.styles import ensure_heading_style, ensure_table_style, create_style
from word_document_server.core.protection import add_protection_info, verify_document_protection, is_section_editable, create_signature_info, verify_signature
from word_document_server.core.footnotes import add_footnote, add_endnote, convert_footnotes_to_endnotes, find_footnote_references, get_format_symbols, customize_footnote_formatting
from word_document_server.core.tables import set_cell_border, apply_table_style, copy_table
```
--------------------------------------------------------------------------------
/pyproject.toml:
--------------------------------------------------------------------------------
```toml
[build-system]
requires = ["hatchling"]
build-backend = "hatchling.build"
[project]
name = "office-word-mcp-server"
version = "1.1.10"
description = "MCP server for manipulating Microsoft Word documents"
readme = "README.md"
license = {file = "LICENSE"}
authors = [
{name = "GongRzhe", email = "[email protected]"}
]
classifiers = [
"Programming Language :: Python :: 3",
"License :: OSI Approved :: MIT License",
"Operating System :: OS Independent",
]
requires-python = ">=3.11"
dependencies = [
"python-docx>=1.1.2",
"fastmcp>=2.8.1",
"msoffcrypto-tool>=5.4.2",
"docx2pdf>=0.1.8",
"pytest>=8.4.2",
]
[project.urls]
"Homepage" = "https://github.com/GongRzhe/Office-Word-MCP-Server.git"
"Bug Tracker" = "https://github.com/GongRzhe/Office-Word-MCP-Server.git/issues"
[tool.hatch.build.targets.wheel]
only-include = [
"word_document_server",
"office_word_mcp_server",
]
sources = ["."]
[project.scripts]
word_mcp_server = "word_document_server.main:run_server"
```
--------------------------------------------------------------------------------
/word_document_server/tools/__init__.py:
--------------------------------------------------------------------------------
```python
"""
MCP tool implementations for the Word Document Server.
This package contains the MCP tool implementations that expose functionality
to clients through the Model Context Protocol.
"""
# Document tools
from word_document_server.tools.document_tools import (
    create_document, get_document_info, get_document_text,
    get_document_outline, list_available_documents,
    copy_document, merge_documents
)
# Content tools
from word_document_server.tools.content_tools import (
    add_heading, add_paragraph, add_table, add_picture,
    add_page_break, add_table_of_contents, delete_paragraph,
    search_and_replace
)
# Format tools
from word_document_server.tools.format_tools import (
    format_text, create_custom_style, format_table
)
# Protection tools
from word_document_server.tools.protection_tools import (
    protect_document, add_restricted_editing,
    add_digital_signature, verify_document
)
# Footnote tools
from word_document_server.tools.footnote_tools import (
    add_footnote_to_document, add_endnote_to_document,
    convert_footnotes_to_endnotes_in_document, customize_footnote_style
)
# Comment tools
from word_document_server.tools.comment_tools import (
    get_all_comments, get_comments_by_author, get_comments_for_paragraph
)
```
--------------------------------------------------------------------------------
/RENDER_DEPLOYMENT.md:
--------------------------------------------------------------------------------
```markdown
# Render Deployment Guide
This document explains how to deploy the Office Word MCP Server on Render.
## Required Environment Variables
Set the following environment variables in your Render service:
### `MCP_TRANSPORT`
- **Value**: `sse`
- **Description**: Sets the transport type to Server-Sent Events (SSE) for HTTP communication
- **Required**: Yes (for Render deployment)
### `MCP_HOST`
- **Value**: `0.0.0.0`
- **Description**: Binds the server to all network interfaces
- **Required**: No (defaults to 0.0.0.0)
### `FASTMCP_LOG_LEVEL`
- **Value**: `INFO`
- **Description**: Sets the logging level for FastMCP
- **Required**: No (defaults to INFO)
## How to Set Environment Variables
1. Go to your Render dashboard: https://dashboard.render.com
2. Navigate to your service: `Office-Word-MCP-Server`
3. Click on "Environment" in the left sidebar
4. Add the environment variable:
   - Key: `MCP_TRANSPORT`
   - Value: `sse`
5. Click "Save Changes"
## Deployment
After setting the environment variables:
1. Render will automatically redeploy your service
2. The server will start with SSE transport on the port provided by Render
3. Access your server at: `https://office-word-mcp-server-bzlp.onrender.com/sse`
## Health Check Endpoint
The FastMCP server with SSE transport automatically provides a health check endpoint at:
- `https://your-service.onrender.com/health`
## Troubleshooting
### Server exits with status 1
- **Cause**: Server is running in STDIO mode instead of SSE
- **Fix**: Ensure `MCP_TRANSPORT=sse` is set in environment variables
### Port binding errors
- **Cause**: Server not using Render's PORT environment variable
- **Fix**: This has been fixed in the latest version of main.py
### Cannot connect to server
- **Cause**: Health checks failing
- **Fix**: Ensure SSE transport is enabled and server is listening on 0.0.0.0
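The variables in this guide can be read with a few lines of Python. The sketch below shows one plausible approach and is not the code in `main.py`, which is not reproduced here: `resolve_transport_settings` is a hypothetical function, the `stdio` default is inferred from the troubleshooting note above, and the fallback port of 8000 is an assumption. Only `MCP_TRANSPORT`, `MCP_HOST`, and Render's `PORT` come from this guide.

```python
import os
from typing import Mapping, Optional


def resolve_transport_settings(env: Optional[Mapping[str, str]] = None) -> dict:
    """Derive transport settings from environment variables (illustrative only)."""
    env = os.environ if env is None else env
    transport = env.get("MCP_TRANSPORT", "stdio").strip().lower()
    if transport == "stdio":
        # STDIO needs no network settings.
        return {"transport": "stdio"}
    return {
        "transport": transport,
        "host": env.get("MCP_HOST", "0.0.0.0"),
        # Render injects PORT; 8000 is an assumed fallback for local runs.
        "port": int(env.get("PORT", "8000")),
    }
```

Passing a plain dictionary instead of `os.environ` makes the function easy to exercise locally before deploying.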
```
--------------------------------------------------------------------------------
/word_document_server/utils/file_utils.py:
--------------------------------------------------------------------------------
```python
"""
File utility functions for Word Document Server.
"""
import os
from typing import Tuple, Optional
import shutil


def check_file_writeable(filepath: str) -> Tuple[bool, str]:
    """
    Check if a file can be written to.

    Args:
        filepath: Path to the file

    Returns:
        Tuple of (is_writeable, error_message)
    """
    # If file doesn't exist, check if directory is writeable
    if not os.path.exists(filepath):
        directory = os.path.dirname(filepath)
        # If no directory is specified (empty string), use current directory
        if directory == '':
            directory = '.'
        if not os.path.exists(directory):
            return False, f"Directory {directory} does not exist"
        if not os.access(directory, os.W_OK):
            return False, f"Directory {directory} is not writeable"
        return True, ""

    # If file exists, check if it's writeable
    if not os.access(filepath, os.W_OK):
        return False, f"File {filepath} is not writeable (permission denied)"

    # Try to open the file for writing to see if it's locked
    try:
        with open(filepath, 'a'):
            pass
        return True, ""
    except IOError as e:
        return False, f"File {filepath} is not writeable: {str(e)}"
    except Exception as e:
        return False, f"Unknown error checking file permissions: {str(e)}"


def create_document_copy(source_path: str, dest_path: Optional[str] = None) -> Tuple[bool, str, Optional[str]]:
    """
    Create a copy of a document.

    Args:
        source_path: Path to the source document
        dest_path: Optional path for the new document. If not provided, '_copy' is
            inserted before the source extension (e.g. 'report.docx' -> 'report_copy.docx')

    Returns:
        Tuple of (success, message, new_filepath)
    """
    if not os.path.exists(source_path):
        return False, f"Source document {source_path} does not exist", None

    if not dest_path:
        # Generate a new filename if not provided
        base, ext = os.path.splitext(source_path)
        dest_path = f"{base}_copy{ext}"

    try:
        # Simple file copy
        shutil.copy2(source_path, dest_path)
        return True, f"Document copied to {dest_path}", dest_path
    except Exception as e:
        return False, f"Failed to copy document: {str(e)}", None


def ensure_docx_extension(filename: str) -> str:
    """
    Ensure filename has .docx extension.

    Args:
        filename: The filename to check

    Returns:
        Filename with .docx extension
    """
    if not filename.endswith('.docx'):
        return filename + '.docx'
    return filename
```
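Note that `ensure_docx_extension` compares the suffix case-sensitively, so a name such as `Report.DOCX` would gain a second extension. If that matters for your inputs, a case-insensitive variant could look like the sketch below. `ensure_docx_extension_ci` is a hypothetical name and is not part of this module.

```python
def ensure_docx_extension_ci(filename: str) -> str:
    """Append '.docx' unless the name already ends with it, ignoring case."""
    if filename.lower().endswith(".docx"):
        return filename
    return filename + ".docx"
```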
--------------------------------------------------------------------------------
/word_document_server/core/unprotect.py:
--------------------------------------------------------------------------------
```python
"""
Unprotect document functionality for the Word Document Server.
This module handles removing document protection.
"""
import os
import json
import hashlib
import tempfile
import shutil
from typing import Tuple, Optional


def remove_protection_info(filename: str, password: Optional[str] = None) -> Tuple[bool, str]:
    """
    Remove protection information from a document and decrypt it if necessary.

    Args:
        filename: Path to the Word document
        password: Password to verify before removing protection

    Returns:
        Tuple of (success, message)
    """
    base_path, _ = os.path.splitext(filename)
    metadata_path = f"{base_path}.protection"

    # Check if protection metadata exists
    if not os.path.exists(metadata_path):
        return False, "Document is not protected"

    try:
        # Load protection data
        with open(metadata_path, 'r') as f:
            protection_data = json.load(f)

        # Verify password if provided and required
        if password and protection_data.get("password_hash"):
            password_hash = hashlib.sha256(password.encode()).hexdigest()
            if password_hash != protection_data.get("password_hash"):
                return False, "Incorrect password"

        # Handle true encryption if it was applied
        if protection_data.get("true_encryption") and password:
            try:
                import msoffcrypto

                # Create a temporary file for the decrypted output
                temp_fd, temp_path = tempfile.mkstemp(suffix='.docx')
                os.close(temp_fd)

                # Open the encrypted document
                with open(filename, 'rb') as f:
                    office_file = msoffcrypto.OfficeFile(f)

                    # Decrypt with provided password
                    try:
                        office_file.load_key(password=password)

                        # Write the decrypted file to the temp path
                        with open(temp_path, 'wb') as out_file:
                            office_file.decrypt(out_file)

                        # Replace encrypted file with decrypted version
                        shutil.move(temp_path, filename)
                    except Exception as decrypt_error:
                        if os.path.exists(temp_path):
                            os.unlink(temp_path)
                        return False, f"Failed to decrypt document: {str(decrypt_error)}"
            except ImportError:
                return False, "Missing msoffcrypto package required for encryption/decryption"
            except Exception as e:
                return False, f"Error decrypting document: {str(e)}"

        # Remove the protection metadata file
        os.remove(metadata_path)
        return True, "Protection removed successfully"
    except Exception as e:
        return False, f"Error removing protection: {str(e)}"
```
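The password check in `remove_protection_info` compares SHA-256 hex digests. The standalone sketch below reproduces that comparison so it can be tried without a `.protection` file. `password_matches` is a hypothetical helper, and the constant-time `hmac.compare_digest` call is an illustrative substitution: the module itself compares the digests with `!=`.

```python
import hashlib
import hmac


def password_matches(password: str, stored_hash: str) -> bool:
    """Return True if SHA-256(password) equals the stored hex digest."""
    candidate = hashlib.sha256(password.encode()).hexdigest()
    # compare_digest avoids leaking the mismatch position through timing.
    return hmac.compare_digest(candidate, stored_hash)


# Usage: hash a known password, then verify candidates against it.
stored = hashlib.sha256("s3cret".encode()).hexdigest()
print(password_matches("s3cret", stored))
print(password_matches("wrong", stored))
```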
--------------------------------------------------------------------------------
/test_formatting.py:
--------------------------------------------------------------------------------
```python
"""
Test script for add_paragraph and add_heading formatting parameters.
"""
import asyncio
from docx import Document
from word_document_server.tools.content_tools import add_paragraph, add_heading
from word_document_server.tools.document_tools import create_document


async def test_formatting():
    """Test the new formatting parameters."""
    test_doc = 'test_formatting.docx'

    # Create test document
    print("Creating test document...")
    await create_document(test_doc, title="Formatting Test", author="Test Script")

    # Test 1: Name with large font
    print("Test 1: Adding name with large Helvetica 36pt bold...")
    result = await add_paragraph(
        test_doc,
        "JAMES MEHORTER",
        font_name="Helvetica",
        font_size=36,
        bold=True
    )
    print(f" Result: {result}")

    # Test 2: Title line
    print("Test 2: Adding title with Helvetica 14pt...")
    result = await add_paragraph(
        test_doc,
        "Principal Software Engineer | Technical Team Lead",
        font_name="Helvetica",
        font_size=14
    )
    print(f" Result: {result}")

    # Test 3: Section header with border
    print("Test 3: Adding section header with border...")
    result = await add_heading(
        test_doc,
        "PROFESSIONAL SUMMARY",
        level=2,
        font_name="Helvetica",
        font_size=14,
        bold=True,
        border_bottom=True
    )
    print(f" Result: {result}")

    # Test 4: Body text in Times New Roman
    print("Test 4: Adding body text in Times New Roman 14pt...")
    result = await add_paragraph(
        test_doc,
        "This is body text that should be in Times New Roman at 14pt. "
        "It demonstrates the ability to apply different fonts to different paragraphs.",
        font_name="Times New Roman",
        font_size=14
    )
    print(f" Result: {result}")

    # Test 5: Another section header
    print("Test 5: Adding another section header with border...")
    result = await add_heading(
        test_doc,
        "SKILLS",
        level=2,
        font_name="Helvetica",
        font_size=14,
        bold=True,
        border_bottom=True
    )
    print(f" Result: {result}")

    # Test 6: Italic text with color
    print("Test 6: Adding italic text with color...")
    result = await add_paragraph(
        test_doc,
        "This text is italic and colored blue.",
        font_name="Arial",
        font_size=12,
        italic=True,
        color="0000FF"
    )
    print(f" Result: {result}")
    print(f"\n✅ Test document created: {test_doc}")

    # Verify formatting
    print("\nVerifying formatting...")
    verify_doc = Document(test_doc)
    for i, para in enumerate(verify_doc.paragraphs):
        if para.runs:
            run = para.runs[0]
            text_preview = para.text[:50] + "..." if len(para.text) > 50 else para.text
            print(f"\nParagraph {i}: {text_preview}")
            print(f" Font: {run.font.name}")
            print(f" Size: {run.font.size}")
            print(f" Bold: {run.font.bold}")
            print(f" Italic: {run.font.italic}")

    print("\n✅ All tests completed successfully!")
    print(f"Open {test_doc} in Word to verify the formatting visually.")


if __name__ == "__main__":
    asyncio.run(test_formatting())
```
--------------------------------------------------------------------------------
/tests/test_convert_to_pdf.py:
--------------------------------------------------------------------------------
```python
import asyncio
from pathlib import Path
import pytest
from docx import Document
# Target for testing: convert_to_pdf (async function)
from word_document_server.tools.extended_document_tools import convert_to_pdf
def _make_sample_docx(path: Path) -> None:
"""Generates a simple .docx file in a temporary directory."""
doc = Document()
doc.add_heading("Conversion Test Document", level=1)
doc.add_paragraph("This is a test paragraph for PDF conversion. Contains ASCII too.")
doc.add_paragraph("Second paragraph: Contains special characters and spaces to cover path/content edge cases.")
doc.save(path)
def test_convert_to_pdf_with_temp_docx(tmp_path: Path):
"""
End-to-end test: Create a temporary .docx -> call convert_to_pdf -> validate the PDF output.
Notes:
- On Linux/macOS, it first tries LibreOffice (soffice/libreoffice),
and falls back to docx2pdf on failure (requires Microsoft Word).
- If these tools are missing or the command is unavailable, the test is skipped with a reason.
"""
# 1) Generate a docx file with spaces in its name in the temp directory
src_doc = tmp_path / "sample document with spaces.docx"
_make_sample_docx(src_doc)
# 2) Define the output PDF path (also in the temp directory)
out_pdf = tmp_path / "converted output.pdf"
# 3) Run the asynchronous function under test
result_msg = asyncio.run(convert_to_pdf(str(src_doc), output_filename=str(out_pdf)))
# 4) Success condition: the return message contains success keywords, or the target PDF exists
success_keywords = ["successfully converted", "converted to PDF"]
success = any(k.lower() in result_msg.lower() for k in success_keywords) or out_pdf.exists()
if not success:
# When LibreOffice or Microsoft Word is not installed, the tool returns a hint.
# In this case, skip the test instead of failing.
pytest.skip(f"PDF conversion tool unavailable or conversion failed: {result_msg}")
# 5) Assert: The PDF file was generated and is not empty
# Some environments (especially docx2pdf) might ignore the exact output filename
# and just generate a PDF with the same name as the source in the output or source directory,
# so we check multiple possible locations.
candidates = [
out_pdf,
# Common: A PDF with the same name as the source file in the output directory
out_pdf.parent / f"{src_doc.stem}.pdf",
# Fallback: A PDF in the same directory as the source file
src_doc.with_suffix(".pdf"),
]
# If none of the above paths exist, search for any newly generated PDF in the temp directory
found = None
for p in candidates:
if p.exists():
found = p
break
if not found:
pdfs = sorted(tmp_path.glob("*.pdf"), key=lambda p: p.stat().st_mtime, reverse=True)
if pdfs:
found = pdfs[0]
if not found:
# If the tool returns success but the output can't be found,
# treat it as an environment/tooling difference and skip instead of failing.
pytest.skip(f"Could not find the generated PDF. Function output: {result_msg}")
assert found.exists(), f"Generated PDF not found: {found}, function output: {result_msg}"
assert found.stat().st_size > 0, f"The generated PDF file is empty: {found}"
if __name__ == "__main__":
# Allow running this file directly for quick verification:
# python tests/test_convert_to_pdf.py
import sys
sys.exit(pytest.main([__file__, "-q"]))
```
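The test above only discovers a missing converter after attempting a conversion. As a minimal sketch, assuming the LibreOffice binary is discoverable on `PATH`, an up-front check could look like the following. `find_converter` is a hypothetical helper that is not part of the repository; the candidate names mirror the ones mentioned in the test's docstring.

```python
import shutil
from typing import Optional, Sequence


def find_converter(candidates: Sequence[str] = ("soffice", "libreoffice")) -> Optional[str]:
    """Return the full path of the first candidate found on PATH, or None."""
    for name in candidates:
        path = shutil.which(name)
        if path:
            return path
    return None


if __name__ == "__main__":
    # Hypothetical usage: report availability before running the conversion test.
    print(find_converter() or "No LibreOffice binary found on PATH")
```

A test could call such a helper first and skip with a precise reason, rather than inferring the cause from the returned message.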
--------------------------------------------------------------------------------
/word_document_server/core/styles.py:
--------------------------------------------------------------------------------
```python
"""
Style-related functions for Word Document Server.
"""
from docx.shared import Pt
from docx.enum.style import WD_STYLE_TYPE
def ensure_heading_style(doc):
"""
Ensure Heading styles exist in the document.
Args:
doc: Document object
"""
for i in range(1, 10): # Create Heading 1 through Heading 9
style_name = f'Heading {i}'
try:
# Try to access the style to see if it exists
style = doc.styles[style_name]
except KeyError:
# Create the style if it doesn't exist
try:
style = doc.styles.add_style(style_name, WD_STYLE_TYPE.PARAGRAPH)
if i == 1:
style.font.size = Pt(16)
style.font.bold = True
elif i == 2:
style.font.size = Pt(14)
style.font.bold = True
else:
style.font.size = Pt(12)
style.font.bold = True
except Exception:
# If style creation fails, we'll just use default formatting
pass
def ensure_table_style(doc):
"""
Ensure Table Grid style exists in the document.
Args:
doc: Document object
"""
try:
# Try to access the style to see if it exists
style = doc.styles['Table Grid']
except KeyError:
# If style doesn't exist, we'll handle it at usage time
pass
def create_style(doc, style_name, style_type, base_style=None, font_properties=None, paragraph_properties=None):
"""
Create a new style in the document.
Args:
doc: Document object
style_name: Name for the new style
style_type: Type of style (WD_STYLE_TYPE)
base_style: Optional base style to inherit from
font_properties: Dictionary of font properties (bold, italic, size, name, color)
paragraph_properties: Dictionary of paragraph properties (alignment, spacing)
Returns:
The created style
"""
from docx.shared import Pt
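try:
# Reuse the style if one with this name already exists. Looking it up by name
# raises KeyError on a miss, whereas get_by_id() expects a style ID and
# returns the default style when nothing matches.
return doc.styles[style_name]
except KeyError: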
# Create new style
new_style = doc.styles.add_style(style_name, style_type)
# Set base style if specified
if base_style:
new_style.base_style = doc.styles[base_style]
# Set font properties
if font_properties:
font = new_style.font
if 'bold' in font_properties:
font.bold = font_properties['bold']
if 'italic' in font_properties:
font.italic = font_properties['italic']
if 'size' in font_properties:
font.size = Pt(font_properties['size'])
if 'name' in font_properties:
font.name = font_properties['name']
if 'color' in font_properties:
from docx.shared import RGBColor
# Define common RGB colors
color_map = {
'red': RGBColor(255, 0, 0),
'blue': RGBColor(0, 0, 255),
'green': RGBColor(0, 128, 0),
'yellow': RGBColor(255, 255, 0),
'black': RGBColor(0, 0, 0),
'gray': RGBColor(128, 128, 128),
'white': RGBColor(255, 255, 255),
'purple': RGBColor(128, 0, 128),
'orange': RGBColor(255, 165, 0)
}
color_value = font_properties['color']
try:
# Handle string color names
if isinstance(color_value, str) and color_value.lower() in color_map:
font.color.rgb = color_map[color_value.lower()]
# Handle RGBColor objects
elif isinstance(color_value, RGBColor):
font.color.rgb = color_value
# Try to parse as RGB string
elif isinstance(color_value, str):
font.color.rgb = RGBColor.from_string(color_value)
# Use directly if it's already an RGB value
else:
font.color.rgb = color_value
except Exception as e:
# Fallback to black if all else fails
font.color.rgb = RGBColor(0, 0, 0)
# Set paragraph properties
if paragraph_properties:
if 'alignment' in paragraph_properties:
new_style.paragraph_format.alignment = paragraph_properties['alignment']
if 'spacing' in paragraph_properties:
new_style.paragraph_format.line_spacing = paragraph_properties['spacing']
return new_style
```
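The color handling in `create_style` accepts a color name or a hex string and falls back to black. The sketch below isolates that resolution logic using plain tuples so it runs without python-docx. `parse_color` and `NAMED_COLORS` are hypothetical names, the table is only a subset of `color_map`, and accepting a leading `#` is an assumption that goes beyond what the original code does.

```python
from typing import Tuple

# Subset of the names used in create_style's color_map (illustrative only).
NAMED_COLORS = {
    "red": (255, 0, 0),
    "blue": (0, 0, 255),
    "green": (0, 128, 0),
    "black": (0, 0, 0),
}


def parse_color(value: str, fallback: Tuple[int, int, int] = (0, 0, 0)) -> Tuple[int, int, int]:
    """Resolve a color name or 6-digit hex string to an (r, g, b) tuple."""
    text = value.strip().lower()
    if text in NAMED_COLORS:
        return NAMED_COLORS[text]
    # Assumption: tolerate a leading "#", which the original code does not handle.
    hex_digits = text.lstrip("#")
    if len(hex_digits) == 6:
        try:
            return tuple(int(hex_digits[i:i + 2], 16) for i in (0, 2, 4))
        except ValueError:
            pass  # not valid hex; fall through to the fallback
    return fallback
```

Keeping the resolution in one pure function makes the fallback behavior easy to test separately from the document object.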
--------------------------------------------------------------------------------
/word_document_server/tools/comment_tools.py:
--------------------------------------------------------------------------------
```python
"""
Comment extraction tools for Word Document Server.
These tools provide high-level interfaces for extracting and analyzing
comments from Word documents through the MCP protocol.
"""
import os
import json
from typing import Dict, List, Optional, Any
from docx import Document
from word_document_server.utils.file_utils import ensure_docx_extension
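from word_document_server.core.comments import (
extract_all_comments,
filter_comments_by_author,
# Aliased so the async tool of the same name defined below does not shadow it.
get_comments_for_paragraph as core_get_comments_for_paragraph
)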
async def get_all_comments(filename: str) -> str:
"""
Extract all comments from a Word document.
Args:
filename: Path to the Word document
Returns:
JSON string containing all comments with metadata
"""
filename = ensure_docx_extension(filename)
if not os.path.exists(filename):
return json.dumps({
'success': False,
'error': f'Document {filename} does not exist'
}, indent=2)
try:
# Load the document
doc = Document(filename)
# Extract all comments
comments = extract_all_comments(doc)
# Return results
return json.dumps({
'success': True,
'comments': comments,
'total_comments': len(comments)
}, indent=2)
except Exception as e:
return json.dumps({
'success': False,
'error': f'Failed to extract comments: {str(e)}'
}, indent=2)
async def get_comments_by_author(filename: str, author: str) -> str:
"""
Extract comments from a specific author in a Word document.
Args:
filename: Path to the Word document
author: Name of the comment author to filter by
Returns:
JSON string containing filtered comments
"""
filename = ensure_docx_extension(filename)
if not os.path.exists(filename):
return json.dumps({
'success': False,
'error': f'Document {filename} does not exist'
}, indent=2)
if not author or not author.strip():
return json.dumps({
'success': False,
'error': 'Author name cannot be empty'
}, indent=2)
try:
# Load the document
doc = Document(filename)
# Extract all comments
all_comments = extract_all_comments(doc)
# Filter by author
author_comments = filter_comments_by_author(all_comments, author)
# Return results
return json.dumps({
'success': True,
'author': author,
'comments': author_comments,
'total_comments': len(author_comments)
}, indent=2)
except Exception as e:
return json.dumps({
'success': False,
'error': f'Failed to extract comments: {str(e)}'
}, indent=2)
async def get_comments_for_paragraph(filename: str, paragraph_index: int) -> str:
"""
Extract comments for a specific paragraph in a Word document.
Args:
filename: Path to the Word document
paragraph_index: Index of the paragraph (0-based)
Returns:
JSON string containing comments for the specified paragraph
"""
filename = ensure_docx_extension(filename)
if not os.path.exists(filename):
return json.dumps({
'success': False,
'error': f'Document {filename} does not exist'
}, indent=2)
if paragraph_index < 0:
return json.dumps({
'success': False,
'error': 'Paragraph index must be non-negative'
}, indent=2)
try:
# Load the document
doc = Document(filename)
# Check if paragraph index is valid
if paragraph_index >= len(doc.paragraphs):
return json.dumps({
'success': False,
'error': f'Paragraph index {paragraph_index} is out of range. Document has {len(doc.paragraphs)} paragraphs.'
}, indent=2)
# Extract all comments
all_comments = extract_all_comments(doc)
# Filter for the specific paragraph
from word_document_server.core.comments import get_comments_for_paragraph as core_get_comments_for_paragraph
para_comments = core_get_comments_for_paragraph(all_comments, paragraph_index)
# Get the paragraph text for context
paragraph_text = doc.paragraphs[paragraph_index].text
# Return results
return json.dumps({
'success': True,
'paragraph_index': paragraph_index,
'paragraph_text': paragraph_text,
'comments': para_comments,
'total_comments': len(para_comments)
}, indent=2)
except Exception as e:
return json.dumps({
'success': False,
'error': f'Failed to extract comments: {str(e)}'
}, indent=2)
```
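Each tool above returns a JSON string with a `success` flag and either an `error` message or a `comments` list. A minimal sketch of how a caller might consume that envelope follows. `summarize_comment_response` is a hypothetical helper, and the sample payloads in the usage section are made up for illustration.

```python
import json


def summarize_comment_response(raw: str) -> str:
    """Turn a comment-tool JSON response into a one-line summary."""
    payload = json.loads(raw)
    if not payload.get("success"):
        return f"error: {payload.get('error', 'unknown error')}"
    # Collect distinct author names; missing authors are reported as "Unknown".
    authors = sorted({c.get("author", "Unknown") for c in payload.get("comments", [])})
    return f"{payload.get('total_comments', 0)} comment(s) by {', '.join(authors) or 'nobody'}"


if __name__ == "__main__":
    # Made-up payloads shaped like the responses built in the tools above.
    print(summarize_comment_response(json.dumps(
        {"success": True, "comments": [{"author": "Ann"}], "total_comments": 1}
    )))
    print(summarize_comment_response(json.dumps(
        {"success": False, "error": "Document x.docx does not exist"}
    )))
```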
--------------------------------------------------------------------------------
/word_document_server/utils/extended_document_utils.py:
--------------------------------------------------------------------------------
```python
"""
Extended document utilities for Word Document Server.
"""
from typing import Dict, List, Any, Tuple
from docx import Document
def get_paragraph_text(doc_path: str, paragraph_index: int) -> Dict[str, Any]:
"""
Get text from a specific paragraph in a Word document.
Args:
doc_path: Path to the Word document
paragraph_index: Index of the paragraph to extract (0-based)
Returns:
Dictionary with paragraph text and metadata
"""
import os
if not os.path.exists(doc_path):
return {"error": f"Document {doc_path} does not exist"}
try:
doc = Document(doc_path)
# Check if paragraph index is valid
if paragraph_index < 0 or paragraph_index >= len(doc.paragraphs):
return {"error": f"Invalid paragraph index: {paragraph_index}. Document has {len(doc.paragraphs)} paragraphs."}
paragraph = doc.paragraphs[paragraph_index]
return {
"index": paragraph_index,
"text": paragraph.text,
"style": paragraph.style.name if paragraph.style else "Normal",
"is_heading": paragraph.style.name.startswith("Heading") if paragraph.style else False
}
except Exception as e:
return {"error": f"Failed to get paragraph text: {str(e)}"}
def find_text(doc_path: str, text_to_find: str, match_case: bool = True, whole_word: bool = False) -> Dict[str, Any]:
"""
Find all occurrences of specific text in a Word document.
Args:
doc_path: Path to the Word document
text_to_find: Text to search for
match_case: Whether to perform case-sensitive search
whole_word: Whether to match whole words only
Returns:
Dictionary with search results
"""
import os
if not os.path.exists(doc_path):
return {"error": f"Document {doc_path} does not exist"}
if not text_to_find:
return {"error": "Search text cannot be empty"}
try:
doc = Document(doc_path)
results = {
"query": text_to_find,
"match_case": match_case,
"whole_word": whole_word,
"occurrences": [],
"total_count": 0
}
# Search in paragraphs
for i, para in enumerate(doc.paragraphs):
# Prepare text for comparison
para_text = para.text
search_text = text_to_find
if not match_case:
para_text = para_text.lower()
search_text = search_text.lower()
# Find all occurrences (simple implementation)
start_pos = 0
while True:
if whole_word:
# For whole word search, we need to check word boundaries
words = para_text.split()
found = False
for word_idx, word in enumerate(words):
if (word == search_text or
(not match_case and word.lower() == search_text.lower())):
results["occurrences"].append({
"paragraph_index": i,
"position": word_idx,
"context": para.text[:100] + ("..." if len(para.text) > 100 else "")
})
results["total_count"] += 1
found = True
# Break after checking all words
break
else:
# For substring search
pos = para_text.find(search_text, start_pos)
if pos == -1:
break
results["occurrences"].append({
"paragraph_index": i,
"position": pos,
"context": para.text[:100] + ("..." if len(para.text) > 100 else "")
})
results["total_count"] += 1
start_pos = pos + len(search_text)
# Search in tables
for table_idx, table in enumerate(doc.tables):
for row_idx, row in enumerate(table.rows):
for col_idx, cell in enumerate(row.cells):
for para_idx, para in enumerate(cell.paragraphs):
# Prepare text for comparison
para_text = para.text
search_text = text_to_find
if not match_case:
para_text = para_text.lower()
search_text = search_text.lower()
# Find all occurrences (simple implementation)
start_pos = 0
while True:
if whole_word:
# For whole word search, check word boundaries
words = para_text.split()
found = False
for word_idx, word in enumerate(words):
if (word == search_text or
(not match_case and word.lower() == search_text.lower())):
results["occurrences"].append({
"location": f"Table {table_idx}, Row {row_idx}, Column {col_idx}",
"position": word_idx,
"context": para.text[:100] + ("..." if len(para.text) > 100 else "")
})
results["total_count"] += 1
found = True
# Break after checking all words
break
else:
# For substring search
pos = para_text.find(search_text, start_pos)
if pos == -1:
break
results["occurrences"].append({
"location": f"Table {table_idx}, Row {row_idx}, Column {col_idx}",
"position": pos,
"context": para.text[:100] + ("..." if len(para.text) > 100 else "")
})
results["total_count"] += 1
start_pos = pos + len(search_text)
return results
except Exception as e:
return {"error": f"Failed to search for text: {str(e)}"}
```
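The whole-word branch of `find_text` splits on whitespace, so a word followed by punctuation (for example `cat,`) does not match, and it reports a word index where the substring branch reports a character offset. One possible alternative, sketched here with the standard `re` module, reports character offsets in both modes. `find_positions` is a hypothetical function and is not the repository's implementation.

```python
import re
from typing import List


def find_positions(text: str, query: str, match_case: bool = True, whole_word: bool = False) -> List[int]:
    """Return the character offset of every non-overlapping match of query in text."""
    if not query:
        return []
    pattern = re.escape(query)
    if whole_word:
        # \b only matches next to a word character, so this assumes the query
        # itself starts and ends with a letter, digit, or underscore.
        pattern = rf"\b{pattern}\b"
    flags = 0 if match_case else re.IGNORECASE
    return [m.start() for m in re.finditer(pattern, text, flags)]


if __name__ == "__main__":
    sample = "The cat, the category."
    print(find_positions(sample, "cat"))
    print(find_positions(sample, "cat", whole_word=True))
```

Using one code path for both modes keeps the meaning of `position` consistent across results.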
--------------------------------------------------------------------------------
/word_document_server/core/comments.py:
--------------------------------------------------------------------------------
```python
"""
Core comment extraction functionality for Word documents.
This module provides low-level functions to extract and process comments
from Word documents using the python-docx library.
"""
import datetime
from typing import Dict, List, Optional, Any
from docx import Document
from docx.document import Document as DocumentType
from docx.text.paragraph import Paragraph
def extract_all_comments(doc: DocumentType) -> List[Dict[str, Any]]:
"""
Extract all comments from a Word document.
Args:
doc: The Document object to extract comments from
Returns:
List of dictionaries containing comment information
"""
comments = []
# Access the document's comment part if it exists
try:
# Get the document part
document_part = doc.part
# Find comments part through relationships
comments_part = None
for rel_id, rel in document_part.rels.items():
if rel.reltype.split('/')[-1] == 'comments':
comments_part = rel.target_part
break
if comments_part:
# Extract comments from the comments part using proper xpath syntax
comment_elements = comments_part.element.xpath('.//w:comment')
for idx, comment_element in enumerate(comment_elements):
comment_data = extract_comment_data(comment_element, idx)
if comment_data:
comments.append(comment_data)
# If no comments found, try alternative approach
if not comments:
# Fallback: scan paragraphs for comment references
comments = extract_comments_from_paragraphs(doc)
except Exception as e:
# If direct access fails, try alternative approach
comments = extract_comments_from_paragraphs(doc)
return comments
def extract_comments_from_paragraphs(doc: DocumentType) -> List[Dict[str, Any]]:
"""
Extract comments by scanning paragraphs for comment references.
Args:
doc: The Document object
Returns:
List of comment dictionaries
"""
comments = []
comment_id = 1
# Check all paragraphs in the document
for para_idx, paragraph in enumerate(doc.paragraphs):
para_comments = find_paragraph_comments(paragraph, para_idx, comment_id)
comments.extend(para_comments)
comment_id += len(para_comments)
# Check paragraphs in tables
for table in doc.tables:
for row in table.rows:
for cell in row.cells:
for para_idx, paragraph in enumerate(cell.paragraphs):
para_comments = find_paragraph_comments(paragraph, para_idx, comment_id, in_table=True)
comments.extend(para_comments)
comment_id += len(para_comments)
return comments
def extract_comment_data(comment_element, index: int) -> Optional[Dict[str, Any]]:
"""
Extract data from a comment XML element.
Args:
comment_element: The XML comment element
index: Index for generating a unique ID
Returns:
Dictionary with comment data or None
"""
try:
# Extract comment attributes
comment_id = comment_element.get('{http://schemas.openxmlformats.org/wordprocessingml/2006/main}id', str(index))
author = comment_element.get('{http://schemas.openxmlformats.org/wordprocessingml/2006/main}author', 'Unknown')
initials = comment_element.get('{http://schemas.openxmlformats.org/wordprocessingml/2006/main}initials', '')
date_str = comment_element.get('{http://schemas.openxmlformats.org/wordprocessingml/2006/main}date', '')
# Parse date if available
date = None
if date_str:
try:
date = datetime.datetime.fromisoformat(date_str.replace('Z', '+00:00'))
date = date.isoformat()
except ValueError:
date = date_str
# Extract comment text
text_elements = comment_element.xpath('.//w:t')
text = ''.join(elem.text or '' for elem in text_elements)
return {
'id': f'comment_{index + 1}',
'comment_id': comment_id,
'author': author,
'initials': initials,
'date': date,
'text': text.strip(),
'paragraph_index': None, # Will be filled if we can determine it
'in_table': False,
'reference_text': ''
}
except Exception as e:
return None
def find_paragraph_comments(paragraph: Paragraph, para_index: int,
start_id: int, in_table: bool = False) -> List[Dict[str, Any]]:
"""
Find comments associated with a specific paragraph.
Args:
paragraph: The paragraph to check
para_index: The index of the paragraph
start_id: Starting ID for comments
in_table: Whether the paragraph is in a table
Returns:
List of comment dictionaries
"""
comments = []
try:
# Access the paragraph's XML element
para_xml = paragraph._element
# Look for comment range markers (simplified approach)
# This is a basic implementation - the full version would need more sophisticated XML parsing
# Serialize the element first: str() on an lxml element yields only its repr, not the XML.
xml_text = para_xml.xml
# Simple check for comment references in the XML
if 'commentRangeStart' in xml_text or 'commentReference' in xml_text:
# Create a placeholder comment entry
comment_info = {
'id': f'comment_{start_id}',
'comment_id': f'{start_id}',
'author': 'Unknown',
'initials': '',
'date': None,
'text': 'Comment detected but content not accessible',
'paragraph_index': para_index,
'in_table': in_table,
'reference_text': paragraph.text[:50] + '...' if len(paragraph.text) > 50 else paragraph.text
}
comments.append(comment_info)
except Exception:
# If we can't access the XML, skip this paragraph
pass
return comments
def filter_comments_by_author(comments: List[Dict[str, Any]], author: str) -> List[Dict[str, Any]]:
"""
Filter comments by author name.
Args:
comments: List of comment dictionaries
author: Author name to filter by (case-insensitive)
Returns:
Filtered list of comments
"""
author_lower = author.lower()
return [c for c in comments if c.get('author', '').lower() == author_lower]
def get_comments_for_paragraph(comments: List[Dict[str, Any]], paragraph_index: int) -> List[Dict[str, Any]]:
"""
Get all comments for a specific paragraph.
Args:
comments: List of all comments
paragraph_index: Index of the paragraph
Returns:
Comments for the specified paragraph
"""
return [c for c in comments if c.get('paragraph_index') == paragraph_index]
```
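`extract_comment_data` reads the `w:id`, `w:author`, and `w:date` attributes and joins the `w:t` text runs of each `w:comment` element. The same idea can be sketched with only the standard library on a hand-written XML fragment. `parse_comments` and `SAMPLE_COMMENTS_XML` are hypothetical names, and the sample is made up, not taken from a real document.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

# Made-up fragment shaped like a comments part; not taken from a real file.
SAMPLE_COMMENTS_XML = f"""<w:comments xmlns:w="{W_NS}">
  <w:comment w:id="0" w:author="Ann" w:initials="A" w:date="2024-01-02T03:04:05Z">
    <w:p><w:r><w:t>Please </w:t></w:r><w:r><w:t>rephrase.</w:t></w:r></w:p>
  </w:comment>
</w:comments>"""


def parse_comments(xml_text: str) -> List[Dict[str, str]]:
    """Collect id, author, date, and text from each w:comment element."""
    root = ET.fromstring(xml_text)
    comments = []
    for element in root.iter(f"{{{W_NS}}}comment"):
        # Join every text run inside the comment, as the original does with './/w:t'.
        text = "".join(t.text or "" for t in element.iter(f"{{{W_NS}}}t"))
        comments.append({
            "comment_id": element.get(f"{{{W_NS}}}id", ""),
            "author": element.get(f"{{{W_NS}}}author", "Unknown"),
            "date": element.get(f"{{{W_NS}}}date", ""),
            "text": text.strip(),
        })
    return comments


if __name__ == "__main__":
    print(parse_comments(SAMPLE_COMMENTS_XML))
```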
--------------------------------------------------------------------------------
/word_document_server/tools/document_tools.py:
--------------------------------------------------------------------------------
```python
"""
Document creation and manipulation tools for Word Document Server.
"""
import os
import json
from typing import Dict, List, Optional, Any
from docx import Document
from word_document_server.utils.file_utils import check_file_writeable, ensure_docx_extension, create_document_copy
from word_document_server.utils.document_utils import get_document_properties, extract_document_text, get_document_structure, get_document_xml, insert_header_near_text, insert_line_or_paragraph_near_text
from word_document_server.core.styles import ensure_heading_style, ensure_table_style
async def create_document(filename: str, title: Optional[str] = None, author: Optional[str] = None) -> str:
"""Create a new Word document with optional metadata.
Args:
filename: Name of the document to create (with or without .docx extension)
title: Optional title for the document metadata
author: Optional author for the document metadata
"""
filename = ensure_docx_extension(filename)
# Check if file is writeable
is_writeable, error_message = check_file_writeable(filename)
if not is_writeable:
return f"Cannot create document: {error_message}"
try:
doc = Document()
# Set properties if provided
if title:
doc.core_properties.title = title
if author:
doc.core_properties.author = author
# Ensure necessary styles exist
ensure_heading_style(doc)
ensure_table_style(doc)
# Save the document
doc.save(filename)
return f"Document {filename} created successfully"
except Exception as e:
return f"Failed to create document: {str(e)}"
async def get_document_info(filename: str) -> str:
"""Get information about a Word document.
Args:
filename: Path to the Word document
"""
filename = ensure_docx_extension(filename)
if not os.path.exists(filename):
return f"Document {filename} does not exist"
try:
properties = get_document_properties(filename)
return json.dumps(properties, indent=2)
except Exception as e:
return f"Failed to get document info: {str(e)}"
async def get_document_text(filename: str) -> str:
"""Extract all text from a Word document.
Args:
filename: Path to the Word document
"""
filename = ensure_docx_extension(filename)
return extract_document_text(filename)
async def get_document_outline(filename: str) -> str:
"""Get the structure of a Word document.
Args:
filename: Path to the Word document
"""
filename = ensure_docx_extension(filename)
structure = get_document_structure(filename)
return json.dumps(structure, indent=2)
async def list_available_documents(directory: str = ".") -> str:
"""List all .docx files in the specified directory.
Args:
directory: Directory to search for Word documents
"""
try:
if not os.path.exists(directory):
return f"Directory {directory} does not exist"
docx_files = [f for f in os.listdir(directory) if f.endswith('.docx')]
if not docx_files:
return f"No Word documents found in {directory}"
result = f"Found {len(docx_files)} Word documents in {directory}:\n"
for file in docx_files:
file_path = os.path.join(directory, file)
size = os.path.getsize(file_path) / 1024 # KB
result += f"- {file} ({size:.2f} KB)\n"
return result
except Exception as e:
return f"Failed to list documents: {str(e)}"
async def copy_document(source_filename: str, destination_filename: Optional[str] = None) -> str:
"""Create a copy of a Word document.
Args:
source_filename: Path to the source document
destination_filename: Optional path for the copy. If not provided, a default name will be generated.
"""
source_filename = ensure_docx_extension(source_filename)
if destination_filename:
destination_filename = ensure_docx_extension(destination_filename)
success, message, new_path = create_document_copy(source_filename, destination_filename)
if success:
return message
else:
return f"Failed to copy document: {message}"
async def merge_documents(target_filename: str, source_filenames: List[str], add_page_breaks: bool = True) -> str:
"""Merge multiple Word documents into a single document.
Args:
target_filename: Path to the target document (will be created or overwritten)
source_filenames: List of paths to source documents to merge
add_page_breaks: If True, add page breaks between documents
"""
from word_document_server.core.tables import copy_table
target_filename = ensure_docx_extension(target_filename)
# Check if target file is writeable
is_writeable, error_message = check_file_writeable(target_filename)
if not is_writeable:
return f"Cannot create target document: {error_message}"
# Validate all source documents exist
missing_files = []
for filename in source_filenames:
doc_filename = ensure_docx_extension(filename)
if not os.path.exists(doc_filename):
missing_files.append(doc_filename)
if missing_files:
return f"Cannot merge documents. The following source files do not exist: {', '.join(missing_files)}"
try:
# Create a new document for the merged result
target_doc = Document()
# Process each source document
for i, filename in enumerate(source_filenames):
doc_filename = ensure_docx_extension(filename)
source_doc = Document(doc_filename)
# Add page break between documents (except before the first one)
if add_page_breaks and i > 0:
target_doc.add_page_break()
# Copy all paragraphs
for paragraph in source_doc.paragraphs:
# Create a new paragraph with the same text and style
new_paragraph = target_doc.add_paragraph(paragraph.text)
new_paragraph.style = target_doc.styles['Normal'] # Default style
# Try to match the style if possible
try:
if paragraph.style and paragraph.style.name in target_doc.styles:
new_paragraph.style = target_doc.styles[paragraph.style.name]
except Exception:
pass
# Copy run formatting
# Use a distinct index name so the outer document index `i` is not shadowed
for run_idx, run in enumerate(paragraph.runs):
if run_idx < len(new_paragraph.runs):
new_run = new_paragraph.runs[run_idx]
# Copy basic formatting
new_run.bold = run.bold
new_run.italic = run.italic
new_run.underline = run.underline
# Font size if specified
if run.font.size:
new_run.font.size = run.font.size
# Copy all tables
for table in source_doc.tables:
copy_table(table, target_doc)
# Save the merged document
target_doc.save(target_filename)
return f"Successfully merged {len(source_filenames)} documents into {target_filename}"
except Exception as e:
return f"Failed to merge documents: {str(e)}"
async def get_document_xml_tool(filename: str) -> str:
"""Get the raw XML structure of a Word document."""
return get_document_xml(filename)
```
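`list_available_documents` matches the `.docx` suffix case-sensitively and formats its result as text. As a rough sketch of a variant that returns structured data, the helper below matches the extension case-insensitively. `list_docx` is a hypothetical function, and skipping names that start with `~$` rests on the assumption that they are Word's temporary lock files.

```python
import os
import tempfile
from typing import List, Tuple


def list_docx(directory: str) -> List[Tuple[str, float]]:
    """Return (name, size_in_kb) for each .docx file, sorted by name."""
    entries = []
    for name in sorted(os.listdir(directory)):
        # Assumption: "~$"-prefixed files are Word's temporary lock files and are skipped.
        if name.lower().endswith(".docx") and not name.startswith("~$"):
            size_kb = os.path.getsize(os.path.join(directory, name)) / 1024
            entries.append((name, round(size_kb, 2)))
    return entries


if __name__ == "__main__":
    # Hypothetical usage against a throwaway directory with placeholder files.
    with tempfile.TemporaryDirectory() as tmp:
        for name in ("report.docx", "~$report.docx", "notes.txt"):
            with open(os.path.join(tmp, name), "wb") as handle:
                handle.write(b"\0" * 2048)
        print(list_docx(tmp))
```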
--------------------------------------------------------------------------------
/word_document_server/tools/extended_document_tools.py:
--------------------------------------------------------------------------------
```python
"""
Extended document tools for Word Document Server.
These tools provide enhanced document content extraction and search capabilities.
"""
import os
import json
import subprocess
import platform
import shutil
from typing import Dict, List, Optional, Any, Union, Tuple
from docx import Document
from word_document_server.utils.file_utils import check_file_writeable, ensure_docx_extension
from word_document_server.utils.extended_document_utils import get_paragraph_text, find_text
async def get_paragraph_text_from_document(filename: str, paragraph_index: int) -> str:
"""Get text from a specific paragraph in a Word document.
Args:
filename: Path to the Word document
paragraph_index: Index of the paragraph to retrieve (0-based)
"""
filename = ensure_docx_extension(filename)
if not os.path.exists(filename):
return f"Document {filename} does not exist"
if paragraph_index < 0:
return "Invalid parameter: paragraph_index must be a non-negative integer"
try:
result = get_paragraph_text(filename, paragraph_index)
return json.dumps(result, indent=2)
except Exception as e:
return f"Failed to get paragraph text: {str(e)}"
async def find_text_in_document(filename: str, text_to_find: str, match_case: bool = True, whole_word: bool = False) -> str:
"""Find occurrences of specific text in a Word document.
Args:
filename: Path to the Word document
text_to_find: Text to search for in the document
match_case: Whether to match case (True) or ignore case (False)
whole_word: Whether to match whole words only (True) or substrings (False)
"""
filename = ensure_docx_extension(filename)
if not os.path.exists(filename):
return f"Document {filename} does not exist"
if not text_to_find:
return "Search text cannot be empty"
try:
result = find_text(filename, text_to_find, match_case, whole_word)
return json.dumps(result, indent=2)
except Exception as e:
return f"Failed to search for text: {str(e)}"
async def convert_to_pdf(filename: str, output_filename: Optional[str] = None) -> str:
"""Convert a Word document to PDF format.
Args:
filename: Path to the Word document
output_filename: Optional path for the output PDF. If not provided,
will use the same name with .pdf extension
"""
filename = ensure_docx_extension(filename)
if not os.path.exists(filename):
return f"Document {filename} does not exist"
# Generate output filename if not provided
if not output_filename:
base_name, _ = os.path.splitext(filename)
output_filename = f"{base_name}.pdf"
elif not output_filename.lower().endswith('.pdf'):
output_filename = f"{output_filename}.pdf"
# Convert to absolute path if not already
if not os.path.isabs(output_filename):
output_filename = os.path.abspath(output_filename)
# Ensure the output directory exists
output_dir = os.path.dirname(output_filename)
if not output_dir:
output_dir = os.path.abspath('.')
# Create the directory if it doesn't exist
os.makedirs(output_dir, exist_ok=True)
# Check if output file can be written
is_writeable, error_message = check_file_writeable(output_filename)
if not is_writeable:
return f"Cannot create PDF: {error_message} (Path: {output_filename}, Dir: {output_dir})"
try:
# Determine platform for appropriate conversion method
system = platform.system()
if system == "Windows":
# On Windows, try docx2pdf which uses Microsoft Word
try:
from docx2pdf import convert
convert(filename, output_filename)
return f"Document successfully converted to PDF: {output_filename}"
            except Exception as e:  # ImportError is a subclass of Exception
return f"Failed to convert document to PDF: {str(e)}\nNote: docx2pdf requires Microsoft Word to be installed."
elif system in ["Linux", "Darwin"]: # Linux or macOS
errors = []
# --- Attempt 1: LibreOffice ---
lo_commands = []
if system == "Darwin": # macOS
lo_commands = ["soffice", "/Applications/LibreOffice.app/Contents/MacOS/soffice"]
else: # Linux
lo_commands = ["libreoffice", "soffice"]
for cmd_name in lo_commands:
try:
output_dir_for_lo = os.path.dirname(output_filename) or '.'
os.makedirs(output_dir_for_lo, exist_ok=True)
cmd = [cmd_name, '--headless', '--convert-to', 'pdf', '--outdir', output_dir_for_lo, filename]
result = subprocess.run(cmd, capture_output=True, text=True, timeout=60, check=False)
if result.returncode == 0:
# LibreOffice typically creates a PDF with the same base name as the source file.
# e.g., 'mydoc.docx' -> 'mydoc.pdf'
base_name = os.path.splitext(os.path.basename(filename))[0]
created_pdf_name = f"{base_name}.pdf"
created_pdf_path = os.path.join(output_dir_for_lo, created_pdf_name)
# If the created file exists, move it to the desired output_filename if necessary.
if os.path.exists(created_pdf_path):
if created_pdf_path != output_filename:
shutil.move(created_pdf_path, output_filename)
# Final check: does the target file now exist?
if os.path.exists(output_filename):
return f"Document successfully converted to PDF via {cmd_name}: {output_filename}"
# If we get here, soffice returned 0 but the expected file wasn't created.
errors.append(f"{cmd_name} returned success code, but output file '{created_pdf_path}' was not found.")
# Continue to the next command or fallback.
else:
errors.append(f"{cmd_name} failed. Stderr: {result.stderr.strip()}")
except FileNotFoundError:
errors.append(f"Command '{cmd_name}' not found.")
                except Exception as e:  # also covers subprocess.SubprocessError (e.g. timeouts)
errors.append(f"An error occurred with {cmd_name}: {str(e)}")
# --- Attempt 2: docx2pdf (Fallback) ---
try:
from docx2pdf import convert
convert(filename, output_filename)
if os.path.exists(output_filename) and os.path.getsize(output_filename) > 0:
return f"Document successfully converted to PDF via docx2pdf: {output_filename}"
else:
errors.append("docx2pdf fallback was executed but failed to create a valid output file.")
except ImportError:
errors.append("docx2pdf is not installed, skipping fallback.")
except Exception as e:
errors.append(f"docx2pdf fallback failed with an exception: {str(e)}")
# --- If all attempts failed ---
error_summary = "Failed to convert document to PDF using all available methods.\n"
error_summary += "Recorded errors: " + "; ".join(errors) + "\n"
error_summary += "To convert documents to PDF, please install either:\n"
error_summary += "1. LibreOffice (recommended for Linux/macOS)\n"
error_summary += "2. Microsoft Word (required for docx2pdf on Windows/macOS)"
return error_summary
else:
return f"PDF conversion not supported on {system} platform"
except Exception as e:
return f"Failed to convert document to PDF: {str(e)}"
```
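The LibreOffice branch of `convert_to_pdf` relies on the fact that `soffice --convert-to pdf --outdir DIR file.docx` names its output after the source file, so the result may need to be moved to the requested path. The following is a minimal sketch of that bookkeeping only, and it does not invoke LibreOffice. The helper name `plan_libreoffice_conversion` and the example paths are hypothetical and are not part of the repository.

```python
import os
from typing import List, Tuple


def plan_libreoffice_conversion(source: str, output_pdf: str,
                                binary: str = "soffice") -> Tuple[List[str], str]:
    """Return the headless command and the path LibreOffice is expected to write.

    Hypothetical helper: it mirrors the argument list built in convert_to_pdf
    but runs nothing.
    """
    out_dir = os.path.dirname(os.path.abspath(output_pdf))
    # LibreOffice keeps the source base name: 'report.docx' -> 'report.pdf'
    base_name = os.path.splitext(os.path.basename(source))[0]
    cmd = [binary, "--headless", "--convert-to", "pdf", "--outdir", out_dir, source]
    return cmd, os.path.join(out_dir, f"{base_name}.pdf")


requested = os.path.abspath(os.path.join("out", "final.pdf"))
cmd, produced = plan_libreoffice_conversion(os.path.join("in", "report.docx"), requested)
# The produced file differs from the requested one, so a move is still needed.
needs_move = produced != requested
```

Keeping the planning step separate from `subprocess.run` makes the rename logic easy to check on machines where LibreOffice is not installed.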
--------------------------------------------------------------------------------
/word_document_server/core/protection.py:
--------------------------------------------------------------------------------
```python
"""
Document protection functionality for Word Document Server.
"""
import os
import json
import hashlib
import datetime
from typing import Dict, List, Tuple, Optional, Any
def add_protection_info(doc_path: str, protection_type: str, password_hash: str,
sections: Optional[List[str]] = None,
signature_info: Optional[Dict[str, Any]] = None,
raw_password: Optional[str] = None) -> bool:
"""
Add document protection information to a separate metadata file and encrypt the document.
Args:
doc_path: Path to the document
protection_type: Type of protection ('password', 'restricted', 'signature')
password_hash: Hashed password for security
sections: List of section names that can be edited (for restricted editing)
signature_info: Information about digital signature
raw_password: The actual password for document encryption
Returns:
True if protection info was successfully added, False otherwise
"""
# Create metadata filename based on document path
base_path, _ = os.path.splitext(doc_path)
metadata_path = f"{base_path}.protection"
# Prepare protection data
protection_data = {
"type": protection_type,
"password_hash": password_hash,
"applied_date": datetime.datetime.now().isoformat(),
}
if sections:
protection_data["editable_sections"] = sections
if signature_info:
protection_data["signature"] = signature_info
# Write protection info to metadata file
try:
with open(metadata_path, 'w') as f:
json.dump(protection_data, f, indent=2)
# Apply actual document encryption if raw_password is provided
if protection_type == "password" and raw_password:
import msoffcrypto
import tempfile
import shutil
# Create a temporary file for the encrypted output
temp_fd, temp_path = tempfile.mkstemp(suffix='.docx')
os.close(temp_fd)
try:
# Open the document
with open(doc_path, 'rb') as f:
office_file = msoffcrypto.OfficeFile(f)
# Encrypt with password
office_file.load_key(password=raw_password)
# Write the encrypted file to the temp path
with open(temp_path, 'wb') as out_file:
                        # encrypt() takes the password as well as the output stream
                        office_file.encrypt(password=raw_password, outfile=out_file)
# Replace original with encrypted version
shutil.move(temp_path, doc_path)
# Update metadata to note that true encryption was applied
protection_data["true_encryption"] = True
with open(metadata_path, 'w') as f:
json.dump(protection_data, f, indent=2)
except Exception as e:
print(f"Encryption error: {str(e)}")
if os.path.exists(temp_path):
os.unlink(temp_path)
return False
return True
except Exception as e:
print(f"Protection error: {str(e)}")
return False
def verify_document_protection(doc_path: str, password: Optional[str] = None) -> Tuple[bool, str]:
"""
Verify if a document is protected and if the password is correct.
Args:
doc_path: Path to the document
password: Password to verify
Returns:
Tuple of (is_protected_and_verified, message)
"""
base_path, _ = os.path.splitext(doc_path)
metadata_path = f"{base_path}.protection"
# Check if protection metadata exists
if not os.path.exists(metadata_path):
return False, "Document is not protected"
try:
# Read protection data
with open(metadata_path, 'r') as f:
protection_data = json.load(f)
# If password is provided, verify it
if password:
password_hash = hashlib.sha256(password.encode()).hexdigest()
if password_hash != protection_data.get("password_hash"):
return False, "Incorrect password"
# Return protection type
protection_type = protection_data.get("type", "unknown")
return True, f"Document is protected with {protection_type} protection"
except Exception as e:
return False, f"Error verifying protection: {str(e)}"
def is_section_editable(doc_path: str, section_name: str) -> bool:
"""
Check if a specific section of a document is editable.
Args:
doc_path: Path to the document
section_name: Name of the section to check
Returns:
True if section is editable, False otherwise
"""
base_path, _ = os.path.splitext(doc_path)
metadata_path = f"{base_path}.protection"
# Check if protection metadata exists
if not os.path.exists(metadata_path):
# If no protection exists, all sections are editable
return True
try:
# Read protection data
with open(metadata_path, 'r') as f:
protection_data = json.load(f)
# Check protection type
if protection_data.get("type") != "restricted":
# If not restricted editing, return based on protection type
return protection_data.get("type") != "password"
# Check if the section is in the list of editable sections
editable_sections = protection_data.get("editable_sections", [])
return section_name in editable_sections
except Exception:
# In case of error, default to not editable for security
return False
def create_signature_info(doc, signer_name: str, reason: Optional[str] = None) -> Dict[str, Any]:
"""
Create signature information for a document.
Args:
doc: Document object
signer_name: Name of the person signing the document
reason: Optional reason for signing
Returns:
Dictionary containing signature information
"""
# Create signature info
signature_info = {
"signer": signer_name,
"timestamp": datetime.datetime.now().isoformat(),
}
if reason:
signature_info["reason"] = reason
# Generate a simple signature hash based on document content and metadata
text_content = "\n".join([p.text for p in doc.paragraphs])
content_hash = hashlib.sha256(text_content.encode()).hexdigest()
signature_info["content_hash"] = content_hash
return signature_info
def verify_signature(doc_path: str) -> Tuple[bool, str]:
"""
Verify a document's digital signature.
Args:
doc_path: Path to the document
Returns:
Tuple of (is_valid, message)
"""
from docx import Document
base_path, _ = os.path.splitext(doc_path)
metadata_path = f"{base_path}.protection"
if not os.path.exists(metadata_path):
return False, "Document is not signed"
try:
# Read protection data
with open(metadata_path, 'r') as f:
protection_data = json.load(f)
if protection_data.get("type") != "signature":
return False, f"Document is protected with {protection_data.get('type')} protection, not a signature"
# Get the original content hash
signature_info = protection_data.get("signature", {})
original_hash = signature_info.get("content_hash")
if not original_hash:
return False, "Invalid signature: missing content hash"
# Calculate current content hash
doc = Document(doc_path)
text_content = "\n".join([p.text for p in doc.paragraphs])
current_hash = hashlib.sha256(text_content.encode()).hexdigest()
# Compare hashes
if current_hash != original_hash:
return False, f"Document has been modified since it was signed by {signature_info.get('signer')}"
return True, f"Document signature is valid. Signed by {signature_info.get('signer')} on {signature_info.get('timestamp')}"
except Exception as e:
return False, f"Error verifying signature: {str(e)}"
```
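The "signature" produced by `create_signature_info` and checked by `verify_signature` is a SHA-256 digest of the paragraph texts joined with newlines. It is not a certificate-based signature, and content outside `doc.paragraphs` (for example table cells) does not influence it. A minimal sketch of that check, using plain strings in place of a python-docx `Document` (the `content_hash` helper name is hypothetical):

```python
import hashlib
from typing import List


def content_hash(paragraph_texts: List[str]) -> str:
    """SHA-256 over paragraph texts joined by newlines, as in create_signature_info."""
    return hashlib.sha256("\n".join(paragraph_texts).encode()).hexdigest()


signed = content_hash(["Title", "Body text"])

# Identical text verifies; any edit or appended paragraph does not.
unchanged = content_hash(["Title", "Body text"]) == signed
tampered = content_hash(["Title", "Body text!"]) == signed
appended = content_hash(["Title", "Body text", "Digitally signed by: A"]) == signed
```

The last case matters in practice: any paragraph added after the hash is taken, including a visible signature block, changes the digest.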
--------------------------------------------------------------------------------
/word_document_server/tools/protection_tools.py:
--------------------------------------------------------------------------------
```python
"""
Protection tools for Word Document Server.
These tools handle document protection features such as
password protection, restricted editing, and digital signatures.
"""
import os
import hashlib
import datetime
import io
from typing import List, Optional, Dict, Any
from docx import Document
import msoffcrypto
from word_document_server.utils.file_utils import check_file_writeable, ensure_docx_extension
from word_document_server.core.protection import (
add_protection_info,
verify_document_protection,
create_signature_info
)
async def protect_document(filename: str, password: str) -> str:
"""Add password protection to a Word document.
Args:
filename: Path to the Word document
password: Password to protect the document with
"""
filename = ensure_docx_extension(filename)
if not os.path.exists(filename):
return f"Document {filename} does not exist"
# Check if file is writeable
is_writeable, error_message = check_file_writeable(filename)
if not is_writeable:
return f"Cannot protect document: {error_message}"
try:
# Read the original file content
with open(filename, "rb") as infile:
original_data = infile.read()
# Create an msoffcrypto file object from the original data
file = msoffcrypto.OfficeFile(io.BytesIO(original_data))
file.load_key(password=password) # Set the password for encryption
# Encrypt the data into an in-memory buffer
encrypted_data_io = io.BytesIO()
file.encrypt(password=password, outfile=encrypted_data_io)
# Overwrite the original file with the encrypted data
with open(filename, "wb") as outfile:
outfile.write(encrypted_data_io.getvalue())
base_path, _ = os.path.splitext(filename)
metadata_path = f"{base_path}.protection"
if os.path.exists(metadata_path):
os.remove(metadata_path)
return f"Document {filename} encrypted successfully with password."
except Exception as e:
# Attempt to restore original file content on failure
try:
if 'original_data' in locals():
with open(filename, "wb") as outfile:
outfile.write(original_data)
return f"Failed to encrypt document {filename}: {str(e)}. Original file restored."
else:
return f"Failed to encrypt document {filename}: {str(e)}. Could not restore original file."
except Exception as restore_e:
return f"Failed to encrypt document {filename}: {str(e)}. Also failed to restore original file: {str(restore_e)}"
async def add_restricted_editing(filename: str, password: str, editable_sections: List[str]) -> str:
"""Add restricted editing to a Word document, allowing editing only in specified sections.
Args:
filename: Path to the Word document
password: Password to protect the document with
editable_sections: List of section names that can be edited
"""
filename = ensure_docx_extension(filename)
if not os.path.exists(filename):
return f"Document {filename} does not exist"
# Check if file is writeable
is_writeable, error_message = check_file_writeable(filename)
if not is_writeable:
return f"Cannot protect document: {error_message}"
try:
# Hash the password for security
password_hash = hashlib.sha256(password.encode()).hexdigest()
# Add protection info to metadata
success = add_protection_info(
filename,
protection_type="restricted",
password_hash=password_hash,
sections=editable_sections
)
        # Report a metadata failure first; only then describe what was applied
        if not success:
            return f"Failed to protect document {filename} with restricted editing"
        if not editable_sections:
            return "No editable sections specified. Document will be fully protected."
        return f"Document {filename} protected with restricted editing. Editable sections: {', '.join(editable_sections)}"
except Exception as e:
return f"Failed to add restricted editing: {str(e)}"
async def add_digital_signature(filename: str, signer_name: str, reason: Optional[str] = None) -> str:
"""Add a digital signature to a Word document.
Args:
filename: Path to the Word document
signer_name: Name of the person signing the document
reason: Optional reason for signing
"""
filename = ensure_docx_extension(filename)
if not os.path.exists(filename):
return f"Document {filename} does not exist"
# Check if file is writeable
is_writeable, error_message = check_file_writeable(filename)
if not is_writeable:
return f"Cannot add signature to document: {error_message}"
try:
        doc = Document(filename)
        # Create signature info; the hash of the unsigned content doubles as the visible signature ID
        signature_info = create_signature_info(doc, signer_name, reason)
        signature_id = signature_info['content_hash'][:8]
        # Add a visible signature block to the document
        doc.add_paragraph("").add_run()  # Add empty paragraph for spacing
        signature_para = doc.add_paragraph()
        signature_para.add_run(f"Digitally signed by: {signer_name}").bold = True
        if reason:
            signature_para.add_run(f"\nReason: {reason}")
        signature_para.add_run(f"\nDate: {datetime.datetime.now().strftime('%Y-%m-%d %H:%M:%S')}")
        signature_para.add_run(f"\nSignature ID: {signature_id}")
        # Re-hash the content with the signature block included. Verification hashes
        # every paragraph of the saved file, so the stored hash must cover the block too;
        # otherwise the document would always be reported as modified.
        text_content = "\n".join([p.text for p in doc.paragraphs])
        signature_info["content_hash"] = hashlib.sha256(text_content.encode()).hexdigest()
        # Add protection info to metadata
        success = add_protection_info(
            filename,
            protection_type="signature",
            password_hash="",  # No password for signature-only
            signature_info=signature_info
        )
        if success:
            # Save the document with the visible signature
            doc.save(filename)
            return f"Digital signature added to document {filename}"
        else:
            return f"Failed to add digital signature to document {filename}"
except Exception as e:
return f"Failed to add digital signature: {str(e)}"
async def verify_document(filename: str, password: Optional[str] = None) -> str:
"""Verify document protection and/or digital signature.
Args:
filename: Path to the Word document
password: Optional password to verify
"""
filename = ensure_docx_extension(filename)
if not os.path.exists(filename):
return f"Document {filename} does not exist"
try:
# Verify document protection
is_verified, message = verify_document_protection(filename, password)
if not is_verified and password:
return f"Document verification failed: {message}"
# If document has a digital signature, verify content integrity
base_path, _ = os.path.splitext(filename)
metadata_path = f"{base_path}.protection"
if os.path.exists(metadata_path):
try:
import json
with open(metadata_path, 'r') as f:
protection_data = json.load(f)
if protection_data.get("type") == "signature":
# Get the original content hash
signature_info = protection_data.get("signature", {})
original_hash = signature_info.get("content_hash")
if original_hash:
# Calculate current content hash
doc = Document(filename)
text_content = "\n".join([p.text for p in doc.paragraphs])
current_hash = hashlib.sha256(text_content.encode()).hexdigest()
# Compare hashes
if current_hash != original_hash:
return f"Document has been modified since it was signed by {signature_info.get('signer')}"
else:
return f"Document signature is valid. Signed by {signature_info.get('signer')} on {signature_info.get('timestamp')}"
except Exception as e:
return f"Error verifying signature: {str(e)}"
return message
except Exception as e:
return f"Failed to verify document: {str(e)}"
async def unprotect_document(filename: str, password: str) -> str:
"""Remove password protection from a Word document.
Args:
filename: Path to the Word document
password: Password that was used to protect the document
"""
filename = ensure_docx_extension(filename)
if not os.path.exists(filename):
return f"Document {filename} does not exist"
# Check if file is writeable
is_writeable, error_message = check_file_writeable(filename)
if not is_writeable:
return f"Cannot modify document: {error_message}"
try:
# Read the encrypted file content
with open(filename, "rb") as infile:
encrypted_data = infile.read()
# Create an msoffcrypto file object from the encrypted data
file = msoffcrypto.OfficeFile(io.BytesIO(encrypted_data))
file.load_key(password=password) # Set the password for decryption
# Decrypt the data into an in-memory buffer
decrypted_data_io = io.BytesIO()
file.decrypt(outfile=decrypted_data_io) # Pass the buffer as the 'outfile' argument
# Overwrite the original file with the decrypted data
with open(filename, "wb") as outfile:
outfile.write(decrypted_data_io.getvalue())
return f"Document {filename} decrypted successfully."
except msoffcrypto.exceptions.InvalidKeyError:
return f"Failed to decrypt document {filename}: Incorrect password."
    except msoffcrypto.exceptions.FileFormatError:  # raised for unsupported or unrecognized files
return f"Failed to decrypt document {filename}: File is not encrypted or is not a supported Office format."
except Exception as e:
# Attempt to restore encrypted file content on failure
try:
if 'encrypted_data' in locals():
with open(filename, "wb") as outfile:
outfile.write(encrypted_data)
return f"Failed to decrypt document {filename}: {str(e)}. Encrypted file restored."
else:
return f"Failed to decrypt document {filename}: {str(e)}. Could not restore encrypted file."
except Exception as restore_e:
return f"Failed to decrypt document {filename}: {str(e)}. Also failed to restore encrypted file: {str(restore_e)}"
```
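`protect_document` and `unprotect_document` overwrite the file in place and try to write the original bytes back if something fails. One alternative, shown here only as a sketch and not as what the repository does, is to write the new bytes to a temporary file in the same directory and swap it in with `os.replace`, so the original is left untouched until the new content is fully written. The helper name `atomic_write_bytes` is hypothetical.

```python
import os
import tempfile


def atomic_write_bytes(path: str, data: bytes) -> None:
    """Write data to a sibling temp file, then replace the target in one step."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
        # os.replace overwrites the destination; the temp file lives in the
        # same directory so the swap stays on one filesystem.
        os.replace(tmp_path, path)
    except Exception:
        # Leave the original file untouched and clean up the temp file.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise


# Usage: replace the contents of an existing file.
demo_dir = tempfile.mkdtemp()
demo_path = os.path.join(demo_dir, "doc.docx")
with open(demo_path, "wb") as f:
    f.write(b"original bytes")
atomic_write_bytes(demo_path, b"encrypted bytes")
```

With this pattern the in-memory backup and the restore branch are no longer needed, because a failed write never touches the original file.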
--------------------------------------------------------------------------------
/word_document_server/tools/content_tools.py:
--------------------------------------------------------------------------------
```python
"""
Content tools for Word Document Server.
These tools add various types of content to Word documents,
including headings, paragraphs, tables, images, and page breaks.
"""
import os
from typing import List, Optional, Dict, Any
from docx import Document
from docx.shared import Inches, Pt, RGBColor
from word_document_server.utils.file_utils import check_file_writeable, ensure_docx_extension
from word_document_server.utils.document_utils import find_and_replace_text, insert_header_near_text, insert_numbered_list_near_text, insert_line_or_paragraph_near_text, replace_paragraph_block_below_header, replace_block_between_manual_anchors
from word_document_server.core.styles import ensure_heading_style, ensure_table_style
async def add_heading(filename: str, text: str, level: int = 1,
font_name: Optional[str] = None, font_size: Optional[int] = None,
bold: Optional[bool] = None, italic: Optional[bool] = None,
border_bottom: bool = False) -> str:
"""Add a heading to a Word document with optional formatting.
Args:
filename: Path to the Word document
text: Heading text
level: Heading level (1-9, where 1 is the highest level)
font_name: Font family (e.g., 'Helvetica')
font_size: Font size in points (e.g., 14)
bold: True/False for bold text
italic: True/False for italic text
border_bottom: True to add bottom border (for section headers)
"""
filename = ensure_docx_extension(filename)
# Ensure level is converted to integer
try:
level = int(level)
except (ValueError, TypeError):
return "Invalid parameter: level must be an integer between 1 and 9"
# Validate level range
if level < 1 or level > 9:
return f"Invalid heading level: {level}. Level must be between 1 and 9."
if not os.path.exists(filename):
return f"Document {filename} does not exist"
# Check if file is writeable
is_writeable, error_message = check_file_writeable(filename)
if not is_writeable:
# Suggest creating a copy
return f"Cannot modify document: {error_message}. Consider creating a copy first or creating a new document."
try:
doc = Document(filename)
# Ensure heading styles exist
ensure_heading_style(doc)
# Try to add heading with style
try:
heading = doc.add_heading(text, level=level)
except Exception as style_error:
# If style-based approach fails, use direct formatting
heading = doc.add_paragraph(text)
heading.style = doc.styles['Normal']
if heading.runs:
run = heading.runs[0]
run.bold = True
# Adjust size based on heading level
if level == 1:
run.font.size = Pt(16)
elif level == 2:
run.font.size = Pt(14)
else:
run.font.size = Pt(12)
# Apply formatting to all runs in the heading
if any([font_name, font_size, bold is not None, italic is not None]):
for run in heading.runs:
if font_name:
run.font.name = font_name
if font_size:
run.font.size = Pt(font_size)
if bold is not None:
run.font.bold = bold
if italic is not None:
run.font.italic = italic
# Add bottom border if requested
if border_bottom:
from docx.oxml import OxmlElement
from docx.oxml.ns import qn
pPr = heading._element.get_or_add_pPr()
pBdr = OxmlElement('w:pBdr')
bottom = OxmlElement('w:bottom')
bottom.set(qn('w:val'), 'single')
bottom.set(qn('w:sz'), '4') # 0.5pt border
bottom.set(qn('w:space'), '0')
bottom.set(qn('w:color'), '000000')
pBdr.append(bottom)
pPr.append(pBdr)
doc.save(filename)
return f"Heading '{text}' (level {level}) added to {filename}"
except Exception as e:
return f"Failed to add heading: {str(e)}"
async def add_paragraph(filename: str, text: str, style: Optional[str] = None,
font_name: Optional[str] = None, font_size: Optional[int] = None,
bold: Optional[bool] = None, italic: Optional[bool] = None,
color: Optional[str] = None) -> str:
"""Add a paragraph to a Word document with optional formatting.
Args:
filename: Path to the Word document
text: Paragraph text
style: Optional paragraph style name
font_name: Font family (e.g., 'Helvetica', 'Times New Roman')
font_size: Font size in points (e.g., 14, 36)
bold: True/False for bold text
italic: True/False for italic text
color: RGB color as hex string (e.g., '000000' for black)
"""
filename = ensure_docx_extension(filename)
if not os.path.exists(filename):
return f"Document {filename} does not exist"
# Check if file is writeable
is_writeable, error_message = check_file_writeable(filename)
if not is_writeable:
# Suggest creating a copy
return f"Cannot modify document: {error_message}. Consider creating a copy first or creating a new document."
try:
doc = Document(filename)
paragraph = doc.add_paragraph(text)
if style:
try:
paragraph.style = style
except KeyError:
# Style doesn't exist, use normal and report it
paragraph.style = doc.styles['Normal']
doc.save(filename)
return f"Style '{style}' not found, paragraph added with default style to {filename}"
# Apply formatting to all runs in the paragraph
if any([font_name, font_size, bold is not None, italic is not None, color]):
for run in paragraph.runs:
if font_name:
run.font.name = font_name
if font_size:
run.font.size = Pt(font_size)
if bold is not None:
run.font.bold = bold
if italic is not None:
run.font.italic = italic
if color:
# Remove any '#' prefix if present
color_hex = color.lstrip('#')
run.font.color.rgb = RGBColor.from_string(color_hex)
doc.save(filename)
return f"Paragraph added to {filename}"
except Exception as e:
return f"Failed to add paragraph: {str(e)}"
async def add_table(filename: str, rows: int, cols: int, data: Optional[List[List[str]]] = None) -> str:
"""Add a table to a Word document.
Args:
filename: Path to the Word document
rows: Number of rows in the table
cols: Number of columns in the table
data: Optional 2D array of data to fill the table
"""
filename = ensure_docx_extension(filename)
if not os.path.exists(filename):
return f"Document {filename} does not exist"
# Check if file is writeable
is_writeable, error_message = check_file_writeable(filename)
if not is_writeable:
# Suggest creating a copy
return f"Cannot modify document: {error_message}. Consider creating a copy first or creating a new document."
try:
doc = Document(filename)
table = doc.add_table(rows=rows, cols=cols)
# Try to set the table style
try:
table.style = 'Table Grid'
except KeyError:
            # Style is not available in this document; keep the default table style
pass
# Fill table with data if provided
if data:
for i, row_data in enumerate(data):
if i >= rows:
break
for j, cell_text in enumerate(row_data):
if j >= cols:
break
table.cell(i, j).text = str(cell_text)
doc.save(filename)
return f"Table ({rows}x{cols}) added to {filename}"
except Exception as e:
return f"Failed to add table: {str(e)}"
async def add_picture(filename: str, image_path: str, width: Optional[float] = None) -> str:
"""Add an image to a Word document.
Args:
filename: Path to the Word document
image_path: Path to the image file
        width: Optional width in inches (height is scaled proportionally)
"""
filename = ensure_docx_extension(filename)
# Validate document existence
if not os.path.exists(filename):
return f"Document {filename} does not exist"
# Get absolute paths for better diagnostics
abs_filename = os.path.abspath(filename)
abs_image_path = os.path.abspath(image_path)
# Validate image existence with improved error message
if not os.path.exists(abs_image_path):
return f"Image file not found: {abs_image_path}"
# Check image file size
try:
image_size = os.path.getsize(abs_image_path) / 1024 # Size in KB
if image_size <= 0:
return f"Image file appears to be empty: {abs_image_path} (0 KB)"
except Exception as size_error:
return f"Error checking image file: {str(size_error)}"
# Check if file is writeable
is_writeable, error_message = check_file_writeable(abs_filename)
if not is_writeable:
return f"Cannot modify document: {error_message}. Consider creating a copy first or creating a new document."
try:
doc = Document(abs_filename)
# Additional diagnostic info
diagnostic = f"Attempting to add image ({abs_image_path}, {image_size:.2f} KB) to document ({abs_filename})"
try:
if width:
doc.add_picture(abs_image_path, width=Inches(width))
else:
doc.add_picture(abs_image_path)
doc.save(abs_filename)
return f"Picture {image_path} added to {filename}"
except Exception as inner_error:
# More detailed error for the specific operation
error_type = type(inner_error).__name__
error_msg = str(inner_error)
return f"Failed to add picture: {error_type} - {error_msg or 'No error details available'}\nDiagnostic info: {diagnostic}"
except Exception as outer_error:
# Fallback error handling
error_type = type(outer_error).__name__
error_msg = str(outer_error)
return f"Document processing error: {error_type} - {error_msg or 'No error details available'}"
async def add_page_break(filename: str) -> str:
"""Add a page break to the document.
Args:
filename: Path to the Word document
"""
filename = ensure_docx_extension(filename)
if not os.path.exists(filename):
return f"Document {filename} does not exist"
# Check if file is writeable
is_writeable, error_message = check_file_writeable(filename)
if not is_writeable:
return f"Cannot modify document: {error_message}. Consider creating a copy first."
try:
doc = Document(filename)
doc.add_page_break()
doc.save(filename)
return f"Page break added to {filename}."
except Exception as e:
return f"Failed to add page break: {str(e)}"
async def add_table_of_contents(filename: str, title: str = "Table of Contents", max_level: int = 3) -> str:
"""Add a table of contents to a Word document based on heading styles.
Args:
filename: Path to the Word document
title: Optional title for the table of contents
max_level: Maximum heading level to include (1-9)
    Note:
        The document is rebuilt from paragraph text and table cell text, so
        run-level formatting and images are not carried over, and tables are
        appended after all paragraphs.
"""
filename = ensure_docx_extension(filename)
if not os.path.exists(filename):
return f"Document {filename} does not exist"
# Check if file is writeable
is_writeable, error_message = check_file_writeable(filename)
if not is_writeable:
return f"Cannot modify document: {error_message}. Consider creating a copy first."
try:
# Ensure max_level is within valid range
max_level = max(1, min(max_level, 9))
doc = Document(filename)
# Collect headings and their positions
headings = []
for i, paragraph in enumerate(doc.paragraphs):
# Check if paragraph style is a heading
if paragraph.style and paragraph.style.name.startswith('Heading '):
try:
# Extract heading level from style name
level = int(paragraph.style.name.split(' ')[1])
if level <= max_level:
headings.append({
'level': level,
'text': paragraph.text,
'position': i
})
except (ValueError, IndexError):
# Skip if heading level can't be determined
pass
if not headings:
return f"No headings found in document {filename}. Table of contents not created."
# Create a new document with the TOC
toc_doc = Document()
# Add title
if title:
toc_doc.add_heading(title, level=1)
# Add TOC entries
for heading in headings:
            # Indent based on level (using leading spaces)
indent = ' ' * (heading['level'] - 1)
toc_doc.add_paragraph(f"{indent}{heading['text']}")
# Add page break
toc_doc.add_page_break()
# Get content from original document
for paragraph in doc.paragraphs:
p = toc_doc.add_paragraph(paragraph.text)
# Copy style if possible
try:
if paragraph.style:
p.style = paragraph.style.name
            except Exception:
pass
# Copy tables
for table in doc.tables:
# Create a new table with the same dimensions
new_table = toc_doc.add_table(rows=len(table.rows), cols=len(table.columns))
# Copy cell contents
for i, row in enumerate(table.rows):
for j, cell in enumerate(row.cells):
                    # cell.text joins all paragraphs in the cell; assigning per paragraph
                    # would keep only the last one
                    new_table.cell(i, j).text = cell.text
# Save the new document with TOC
toc_doc.save(filename)
return f"Table of contents with {len(headings)} entries added to {filename}"
except Exception as e:
return f"Failed to add table of contents: {str(e)}"
async def delete_paragraph(filename: str, paragraph_index: int) -> str:
"""Delete a paragraph from a document.
Args:
filename: Path to the Word document
paragraph_index: Index of the paragraph to delete (0-based)
"""
filename = ensure_docx_extension(filename)
if not os.path.exists(filename):
return f"Document {filename} does not exist"
# Check if file is writeable
is_writeable, error_message = check_file_writeable(filename)
if not is_writeable:
return f"Cannot modify document: {error_message}. Consider creating a copy first."
try:
doc = Document(filename)
# Validate paragraph index
if paragraph_index < 0 or paragraph_index >= len(doc.paragraphs):
return f"Invalid paragraph index. Document has {len(doc.paragraphs)} paragraphs (0-{len(doc.paragraphs)-1})."
        # Delete the paragraph by removing its XML element from the parent.
        # Note: python-docx has no public API for paragraph deletion, so this works on the underlying element
paragraph = doc.paragraphs[paragraph_index]
p = paragraph._p
p.getparent().remove(p)
doc.save(filename)
return f"Paragraph at index {paragraph_index} deleted successfully."
except Exception as e:
return f"Failed to delete paragraph: {str(e)}"
async def search_and_replace(filename: str, find_text: str, replace_text: str) -> str:
"""Search for text and replace all occurrences.
Args:
filename: Path to the Word document
find_text: Text to search for
replace_text: Text to replace with
"""
filename = ensure_docx_extension(filename)
if not os.path.exists(filename):
return f"Document {filename} does not exist"
# Check if file is writeable
is_writeable, error_message = check_file_writeable(filename)
if not is_writeable:
return f"Cannot modify document: {error_message}. Consider creating a copy first."
try:
doc = Document(filename)
# Perform find and replace
count = find_and_replace_text(doc, find_text, replace_text)
if count > 0:
doc.save(filename)
return f"Replaced {count} occurrence(s) of '{find_text}' with '{replace_text}'."
else:
return f"No occurrences of '{find_text}' found."
except Exception as e:
return f"Failed to search and replace: {str(e)}"
async def insert_header_near_text_tool(filename: str, target_text: str = None, header_title: str = "", position: str = 'after', header_style: str = 'Heading 1', target_paragraph_index: int = None) -> str:
"""Insert a header (with specified style) before or after the target paragraph. Specify by text or paragraph index."""
return insert_header_near_text(filename, target_text, header_title, position, header_style, target_paragraph_index)
async def insert_numbered_list_near_text_tool(filename: str, target_text: str = None, list_items: list = None, position: str = 'after', target_paragraph_index: int = None, bullet_type: str = 'bullet') -> str:
"""Insert a bulleted or numbered list before or after the target paragraph. Specify by text or paragraph index."""
return insert_numbered_list_near_text(filename, target_text, list_items, position, target_paragraph_index, bullet_type)
async def insert_line_or_paragraph_near_text_tool(filename: str, target_text: str = None, line_text: str = "", position: str = 'after', line_style: str = None, target_paragraph_index: int = None) -> str:
"""Insert a new line or paragraph (with specified or matched style) before or after the target paragraph. Specify by text or paragraph index."""
return insert_line_or_paragraph_near_text(filename, target_text, line_text, position, line_style, target_paragraph_index)
async def replace_paragraph_block_below_header_tool(filename: str, header_text: str, new_paragraphs: list, detect_block_end_fn=None) -> str:
    """Replace the block of paragraphs below a header, without modifying the TOC."""
    return replace_paragraph_block_below_header(filename, header_text, new_paragraphs, detect_block_end_fn)
async def replace_block_between_manual_anchors_tool(filename: str, start_anchor_text: str, new_paragraphs: list, end_anchor_text: str = None, match_fn=None, new_paragraph_style: str = None) -> str:
"""Replace all content between start_anchor_text and end_anchor_text (or next logical header if not provided)."""
return replace_block_between_manual_anchors(filename, start_anchor_text, new_paragraphs, end_anchor_text, match_fn, new_paragraph_style)
```
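The `search_and_replace` tool above delegates to `find_and_replace_text`, which searches each run of a paragraph separately. The sketch below models runs as plain strings to show the consequence: a match that falls inside one run is replaced, while a match split across two runs is not seen. The helper name `replace_per_run` and the list-of-strings model are assumptions made for illustration only; they are not part of the server or of python-docx.

```python
from typing import List, Tuple


def replace_per_run(runs: List[str], old: str, new: str) -> Tuple[List[str], int]:
    """Hypothetical stand-in for a per-run replacement.

    Each string in `runs` plays the role of one run's text and is
    searched independently of its neighbours.
    """
    count = 0
    result = []
    for text in runs:
        if old in text:
            # Count occurrences before replacing them
            count += text.count(old)
            text = text.replace(old, new)
        result.append(text)
    return result, count


# The match sits wholly inside one run, so it is replaced.
single_run, hits = replace_per_run(["Hello world"], "world", "there")

# The match is split across two runs ("wor" + "ld"), so no run contains it.
split_runs, misses = replace_per_run(["Hello wor", "ld"], "world", "there")
```

Word often splits visually continuous text into several runs (for example after a formatting or spelling change), so a reported count of zero does not necessarily mean the text is absent from the paragraph.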
--------------------------------------------------------------------------------
/setup_mcp.py:
--------------------------------------------------------------------------------
```python
# Import necessary Python standard libraries
import os
import json
import subprocess
import sys
import shutil
import platform
def check_prerequisites():
    """
    Check if necessary prerequisites are installed

    Returns:
        tuple: (python_ok, uv_installed, uvx_installed, word_server_installed)
    """
    # Check Python version (tuple comparison also handles future major versions)
    python_version = sys.version_info
    python_ok = python_version >= (3, 8)
    # Check if uv/uvx is installed
    uv_installed = shutil.which("uv") is not None
    uvx_installed = shutil.which("uvx") is not None
    # Check if word-document-server is already installed via pip
    try:
        result = subprocess.run(
            [sys.executable, "-m", "pip", "show", "word-document-server"],
            capture_output=True,
            text=True,
            check=False
        )
        word_server_installed = result.returncode == 0
    except Exception:
        word_server_installed = False
    return (python_ok, uv_installed, uvx_installed, word_server_installed)
def get_transport_choice():
"""
Ask user to choose transport type
Returns:
dict: Transport configuration
"""
print("\nTransport Configuration:")
print("1. STDIO (default, local execution)")
print("2. Streamable HTTP (modern, recommended for web deployment)")
print("3. SSE (Server-Sent Events, for compatibility)")
choice = input("\nSelect transport type (1-3, default: 1): ").strip()
if choice == "2":
host = input("Host (default: 127.0.0.1): ").strip() or "127.0.0.1"
port = input("Port (default: 8000): ").strip() or "8000"
path = input("Path (default: /mcp): ").strip() or "/mcp"
return {
"transport": "streamable-http",
"host": host,
"port": port,
"path": path
}
elif choice == "3":
host = input("Host (default: 127.0.0.1): ").strip() or "127.0.0.1"
port = input("Port (default: 8000): ").strip() or "8000"
sse_path = input("SSE Path (default: /sse): ").strip() or "/sse"
return {
"transport": "sse",
"host": host,
"port": port,
"sse_path": sse_path
}
else:
# Default to stdio
return {
"transport": "stdio"
}
def setup_venv():
"""
Function to set up Python virtual environment
Features:
- Checks if Python version meets requirements (3.8+)
- Creates Python virtual environment (if it doesn't exist)
- Installs required dependencies in the newly created virtual environment
No parameters required
Returns: Path to Python interpreter in the virtual environment
"""
# Check Python version
python_version = sys.version_info
if python_version.major < 3 or (python_version.major == 3 and python_version.minor < 8):
print("Error: Python 3.8 or higher is required.")
sys.exit(1)
# Get absolute path of the directory containing the current script
base_path = os.path.abspath(os.path.dirname(__file__))
# Set virtual environment directory path
venv_path = os.path.join(base_path, '.venv')
# Determine pip and python executable paths based on operating system
is_windows = platform.system() == "Windows"
if is_windows:
pip_path = os.path.join(venv_path, 'Scripts', 'pip.exe')
python_path = os.path.join(venv_path, 'Scripts', 'python.exe')
else:
pip_path = os.path.join(venv_path, 'bin', 'pip')
python_path = os.path.join(venv_path, 'bin', 'python')
# Check if virtual environment already exists and is valid
venv_exists = os.path.exists(venv_path)
pip_exists = os.path.exists(pip_path)
if not venv_exists or not pip_exists:
print("Creating new virtual environment...")
# Remove existing venv if it's invalid
if venv_exists and not pip_exists:
print("Existing virtual environment is incomplete, recreating it...")
try:
shutil.rmtree(venv_path)
except Exception as e:
print(f"Warning: Could not remove existing virtual environment: {e}")
print("Please delete the .venv directory manually and try again.")
sys.exit(1)
# Create virtual environment
try:
subprocess.run([sys.executable, '-m', 'venv', venv_path], check=True)
print("Virtual environment created successfully!")
except subprocess.CalledProcessError as e:
print(f"Error creating virtual environment: {e}")
sys.exit(1)
else:
print("Valid virtual environment already exists.")
# Double-check that pip exists after creating venv
if not os.path.exists(pip_path):
print(f"Error: pip executable not found at {pip_path}")
print("Try creating the virtual environment manually with: python -m venv .venv")
sys.exit(1)
# Install or update dependencies
print("\nInstalling requirements...")
try:
# Install FastMCP package (standalone library)
subprocess.run([pip_path, 'install', 'fastmcp'], check=True)
# Install python-docx package
subprocess.run([pip_path, 'install', 'python-docx'], check=True)
# Also install dependencies from requirements.txt if it exists
requirements_path = os.path.join(base_path, 'requirements.txt')
if os.path.exists(requirements_path):
subprocess.run([pip_path, 'install', '-r', requirements_path], check=True)
print("Requirements installed successfully!")
except subprocess.CalledProcessError as e:
print(f"Error installing requirements: {e}")
sys.exit(1)
except FileNotFoundError:
print(f"Error: Could not execute {pip_path}")
print("Try activating the virtual environment manually and installing requirements:")
if is_windows:
print(f".venv\\Scripts\\activate")
else:
print("source .venv/bin/activate")
print("pip install fastmcp python-docx")
sys.exit(1)
return python_path
def generate_mcp_config_local(python_path, transport_config):
"""
Generate MCP configuration for locally installed word-document-server
Parameters:
- python_path: Path to Python interpreter in the virtual environment
- transport_config: Transport configuration dictionary
Returns: Path to the generated config file
"""
# Get absolute path of the directory containing the current script
base_path = os.path.abspath(os.path.dirname(__file__))
# Path to Word Document Server script
server_script_path = os.path.join(base_path, 'word_mcp_server.py')
# Build environment variables
env = {
"PYTHONPATH": base_path,
"MCP_TRANSPORT": transport_config["transport"]
}
# Add transport-specific environment variables
if transport_config["transport"] == "streamable-http":
env.update({
"MCP_HOST": transport_config["host"],
"MCP_PORT": transport_config["port"],
"MCP_PATH": transport_config["path"]
})
elif transport_config["transport"] == "sse":
env.update({
"MCP_HOST": transport_config["host"],
"MCP_PORT": transport_config["port"],
"MCP_SSE_PATH": transport_config["sse_path"]
})
# For stdio transport, no additional environment variables needed
# Create MCP configuration dictionary
config = {
"mcpServers": {
"word-document-server": {
"command": python_path,
"args": [server_script_path],
"env": env
}
}
}
# Save configuration to JSON file
config_path = os.path.join(base_path, 'mcp-config.json')
with open(config_path, 'w') as f:
json.dump(config, f, indent=2)
return config_path
def generate_mcp_config_uvx(transport_config):
"""
Generate MCP configuration for PyPI-installed word-document-server using UVX
Parameters:
- transport_config: Transport configuration dictionary
Returns: Path to the generated config file
"""
# Get absolute path of the directory containing the current script
base_path = os.path.abspath(os.path.dirname(__file__))
# Build environment variables
env = {
"MCP_TRANSPORT": transport_config["transport"]
}
# Add transport-specific environment variables
if transport_config["transport"] == "streamable-http":
env.update({
"MCP_HOST": transport_config["host"],
"MCP_PORT": transport_config["port"],
"MCP_PATH": transport_config["path"]
})
elif transport_config["transport"] == "sse":
env.update({
"MCP_HOST": transport_config["host"],
"MCP_PORT": transport_config["port"],
"MCP_SSE_PATH": transport_config["sse_path"]
})
# For stdio transport, no additional environment variables needed
# Create MCP configuration dictionary
config = {
"mcpServers": {
"word-document-server": {
"command": "uvx",
"args": ["--from", "word-mcp-server", "word_mcp_server"],
"env": env
}
}
}
# Save configuration to JSON file
config_path = os.path.join(base_path, 'mcp-config.json')
with open(config_path, 'w') as f:
json.dump(config, f, indent=2)
return config_path
def generate_mcp_config_module(transport_config):
"""
Generate MCP configuration for PyPI-installed word-document-server using Python module
Parameters:
- transport_config: Transport configuration dictionary
Returns: Path to the generated config file
"""
# Get absolute path of the directory containing the current script
base_path = os.path.abspath(os.path.dirname(__file__))
# Build environment variables
env = {
"MCP_TRANSPORT": transport_config["transport"]
}
# Add transport-specific environment variables
if transport_config["transport"] == "streamable-http":
env.update({
"MCP_HOST": transport_config["host"],
"MCP_PORT": transport_config["port"],
"MCP_PATH": transport_config["path"]
})
elif transport_config["transport"] == "sse":
env.update({
"MCP_HOST": transport_config["host"],
"MCP_PORT": transport_config["port"],
"MCP_SSE_PATH": transport_config["sse_path"]
})
# Create MCP configuration dictionary
config = {
"mcpServers": {
"word-document-server": {
"command": sys.executable,
"args": ["-m", "word_document_server"],
"env": env
}
}
}
# Save configuration to JSON file
config_path = os.path.join(base_path, 'mcp-config.json')
with open(config_path, 'w') as f:
json.dump(config, f, indent=2)
return config_path
def install_from_pypi():
"""
Install word-document-server from PyPI
Returns: True if successful, False otherwise
"""
print("\nInstalling word-document-server from PyPI...")
try:
subprocess.run([sys.executable, "-m", "pip", "install", "word-mcp-server"], check=True)
print("word-mcp-server successfully installed from PyPI!")
return True
except subprocess.CalledProcessError:
print("Failed to install word-mcp-server from PyPI.")
return False
def print_config_instructions(config_path, transport_config):
"""
Print instructions for using the generated config
Parameters:
- config_path: Path to the generated config file
- transport_config: Transport configuration dictionary
"""
print(f"\nMCP configuration has been written to: {config_path}")
with open(config_path, 'r') as f:
config = json.load(f)
print("\nMCP configuration for Claude Desktop:")
print(json.dumps(config, indent=2))
# Print transport-specific instructions
if transport_config["transport"] == "streamable-http":
print(f"\n📡 Streamable HTTP Transport Configuration:")
print(f" Server will be accessible at: http://{transport_config['host']}:{transport_config['port']}{transport_config['path']}")
print(f" \n To test the server manually:")
print(f" curl -X POST http://{transport_config['host']}:{transport_config['port']}{transport_config['path']}")
elif transport_config["transport"] == "sse":
print(f"\n📡 SSE Transport Configuration:")
print(f" Server will be accessible at: http://{transport_config['host']}:{transport_config['port']}{transport_config['sse_path']}")
print(f" \n To test the server manually:")
print(f" curl http://{transport_config['host']}:{transport_config['port']}{transport_config['sse_path']}")
else: # stdio
print(f"\n💻 STDIO Transport Configuration:")
print(f" Server runs locally with standard input/output")
# Provide instructions for adding configuration to Claude Desktop configuration file
if platform.system() == "Windows":
claude_config_path = os.path.expandvars("%APPDATA%\\Claude\\claude_desktop_config.json")
else: # macOS path; also used as the fallback on other non-Windows systems
claude_config_path = os.path.expanduser("~/Library/Application Support/Claude/claude_desktop_config.json")
print(f"\nTo use with Claude Desktop, merge this configuration into: {claude_config_path}")
def create_package_structure():
"""
Create necessary package structure and environment files
"""
# Get absolute path of the directory containing the current script
base_path = os.path.abspath(os.path.dirname(__file__))
# Create __init__.py file
init_path = os.path.join(base_path, '__init__.py')
if not os.path.exists(init_path):
with open(init_path, 'w') as f:
f.write('# Word Document MCP Server')
print(f"Created __init__.py at: {init_path}")
# Create requirements.txt file
requirements_path = os.path.join(base_path, 'requirements.txt')
if not os.path.exists(requirements_path):
with open(requirements_path, 'w') as f:
f.write('fastmcp\npython-docx\nmsoffcrypto-tool\ndocx2pdf\nhttpx\ncryptography\n')
print(f"Created requirements.txt at: {requirements_path}")
# Create .env.example file
env_example_path = os.path.join(base_path, '.env.example')
if not os.path.exists(env_example_path):
with open(env_example_path, 'w') as f:
f.write("""# Transport Configuration
# Valid options: stdio, streamable-http, sse
MCP_TRANSPORT=stdio
# HTTP/SSE Configuration (when not using stdio)
MCP_HOST=127.0.0.1
MCP_PORT=8000
# Streamable HTTP specific
MCP_PATH=/mcp
# SSE specific
MCP_SSE_PATH=/sse
""")
print(f"Created .env.example at: {env_example_path}")
# Main execution entry point
if __name__ == '__main__':
# Check prerequisites
python_ok, uv_installed, uvx_installed, word_server_installed = check_prerequisites()
if not python_ok:
print("Error: Python 3.8 or higher is required.")
sys.exit(1)
print("Word Document MCP Server Setup (Multi-Transport)")
print("===============================================\n")
# Create necessary files
create_package_structure()
# Get transport configuration
transport_config = get_transport_choice()
# If word-document-server is already installed, offer config options
if word_server_installed:
print("word-document-server is already installed via pip.")
if uvx_installed:
print("\nOptions:")
print("1. Generate MCP config for UVX (recommended)")
print("2. Generate MCP config for Python module")
print("3. Set up local development environment")
choice = input("\nEnter your choice (1-3): ")
if choice == "1":
config_path = generate_mcp_config_uvx(transport_config)
print_config_instructions(config_path, transport_config)
elif choice == "2":
config_path = generate_mcp_config_module(transport_config)
print_config_instructions(config_path, transport_config)
elif choice == "3":
python_path = setup_venv()
config_path = generate_mcp_config_local(python_path, transport_config)
print_config_instructions(config_path, transport_config)
else:
print("Invalid choice. Exiting.")
sys.exit(1)
else:
print("\nOptions:")
print("1. Generate MCP config for Python module")
print("2. Set up local development environment")
choice = input("\nEnter your choice (1-2): ")
if choice == "1":
config_path = generate_mcp_config_module(transport_config)
print_config_instructions(config_path, transport_config)
elif choice == "2":
python_path = setup_venv()
config_path = generate_mcp_config_local(python_path, transport_config)
print_config_instructions(config_path, transport_config)
else:
print("Invalid choice. Exiting.")
sys.exit(1)
# If word-document-server is not installed, offer installation options
else:
print("word-document-server is not installed.")
print("\nOptions:")
print("1. Install from PyPI (recommended)")
print("2. Set up local development environment")
choice = input("\nEnter your choice (1-2): ")
if choice == "1":
if install_from_pypi():
if uvx_installed:
print("\nNow generating MCP config for UVX...")
config_path = generate_mcp_config_uvx(transport_config)
else:
print("\nUVX not found. Generating MCP config for Python module...")
config_path = generate_mcp_config_module(transport_config)
print_config_instructions(config_path, transport_config)
elif choice == "2":
python_path = setup_venv()
config_path = generate_mcp_config_local(python_path, transport_config)
print_config_instructions(config_path, transport_config)
else:
print("Invalid choice. Exiting.")
sys.exit(1)
print("\nSetup complete! You can now use the Word Document MCP server with compatible clients like Claude Desktop.")
print("\nTransport Summary:")
print(f" - Transport: {transport_config['transport']}")
if transport_config['transport'] != 'stdio':
print(f" - Host: {transport_config.get('host', 'N/A')}")
print(f" - Port: {transport_config.get('port', 'N/A')}")
if transport_config['transport'] == 'streamable-http':
print(f" - Path: {transport_config.get('path', 'N/A')}")
elif transport_config['transport'] == 'sse':
print(f" - SSE Path: {transport_config.get('sse_path', 'N/A')}")
```
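The three `generate_mcp_config_*` functions above build the same transport environment variables. As a minimal sketch, that shared logic could live in one helper. The name `build_transport_env` is hypothetical and does not exist in the script; the keys and values mirror what `get_transport_choice` returns.

```python
from typing import Dict


def build_transport_env(transport_config: Dict[str, str]) -> Dict[str, str]:
    """Hypothetical helper: map a transport config to MCP_* environment variables."""
    transport = transport_config["transport"]
    env = {"MCP_TRANSPORT": transport}
    if transport == "streamable-http":
        env.update({
            "MCP_HOST": transport_config["host"],
            "MCP_PORT": transport_config["port"],
            "MCP_PATH": transport_config["path"],
        })
    elif transport == "sse":
        env.update({
            "MCP_HOST": transport_config["host"],
            "MCP_PORT": transport_config["port"],
            "MCP_SSE_PATH": transport_config["sse_path"],
        })
    # stdio needs no additional environment variables
    return env


stdio_env = build_transport_env({"transport": "stdio"})
http_env = build_transport_env({
    "transport": "streamable-http",
    "host": "127.0.0.1",
    "port": "8000",
    "path": "/mcp",
})
```

Each generator would then only differ in its `command` and `args`, and the local variant would add `PYTHONPATH` to the returned dictionary.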
--------------------------------------------------------------------------------
/word_document_server/utils/document_utils.py:
--------------------------------------------------------------------------------
```python
"""
Document utility functions for Word Document Server.
"""
import json
from typing import Dict, List, Any
from docx import Document
from docx.oxml.table import CT_Tbl
from docx.oxml.text.paragraph import CT_P
from docx.oxml.ns import qn
from docx.oxml import OxmlElement
def get_document_properties(doc_path: str) -> Dict[str, Any]:
    """Get properties of a Word document."""
    import os
    if not os.path.exists(doc_path):
        return {"error": f"Document {doc_path} does not exist"}
    try:
        doc = Document(doc_path)
        core_props = doc.core_properties
        return {
            "title": core_props.title or "",
            "author": core_props.author or "",
            "subject": core_props.subject or "",
            "keywords": core_props.keywords or "",
            "created": str(core_props.created) if core_props.created else "",
            "modified": str(core_props.modified) if core_props.modified else "",
            "last_modified_by": core_props.last_modified_by or "",
            "revision": core_props.revision or 0,
            # Note: this is the number of sections, not rendered pages;
            # python-docx does not expose a rendered page count.
            "page_count": len(doc.sections),
            "word_count": sum(len(paragraph.text.split()) for paragraph in doc.paragraphs),
            "paragraph_count": len(doc.paragraphs),
            "table_count": len(doc.tables)
        }
    except Exception as e:
        return {"error": f"Failed to get document properties: {str(e)}"}
def extract_document_text(doc_path: str) -> str:
"""Extract all text from a Word document."""
import os
if not os.path.exists(doc_path):
return f"Document {doc_path} does not exist"
try:
doc = Document(doc_path)
text = []
for paragraph in doc.paragraphs:
text.append(paragraph.text)
for table in doc.tables:
for row in table.rows:
for cell in row.cells:
for paragraph in cell.paragraphs:
text.append(paragraph.text)
return "\n".join(text)
except Exception as e:
return f"Failed to extract text: {str(e)}"
def get_document_structure(doc_path: str) -> Dict[str, Any]:
"""Get the structure of a Word document."""
import os
if not os.path.exists(doc_path):
return {"error": f"Document {doc_path} does not exist"}
try:
doc = Document(doc_path)
structure = {
"paragraphs": [],
"tables": []
}
# Get paragraphs
for i, para in enumerate(doc.paragraphs):
structure["paragraphs"].append({
"index": i,
"text": para.text[:100] + ("..." if len(para.text) > 100 else ""),
"style": para.style.name if para.style else "Normal"
})
# Get tables
for i, table in enumerate(doc.tables):
table_data = {
"index": i,
"rows": len(table.rows),
"columns": len(table.columns),
"preview": []
}
# Get sample of table data
max_rows = min(3, len(table.rows))
for row_idx in range(max_rows):
row_data = []
max_cols = min(3, len(table.columns))
for col_idx in range(max_cols):
try:
cell_text = table.cell(row_idx, col_idx).text
row_data.append(cell_text[:20] + ("..." if len(cell_text) > 20 else ""))
except IndexError:
row_data.append("N/A")
table_data["preview"].append(row_data)
structure["tables"].append(table_data)
return structure
except Exception as e:
return {"error": f"Failed to get document structure: {str(e)}"}
def find_paragraph_by_text(doc, text, partial_match=False):
"""
Find paragraphs containing specific text.
Args:
doc: Document object
text: Text to search for
partial_match: If True, matches paragraphs containing the text; if False, matches exact text
Returns:
List of paragraph indices that match the criteria
"""
matching_paragraphs = []
for i, para in enumerate(doc.paragraphs):
if partial_match and text in para.text:
matching_paragraphs.append(i)
elif not partial_match and para.text == text:
matching_paragraphs.append(i)
return matching_paragraphs
def find_and_replace_text(doc, old_text, new_text):
    """
    Find and replace text throughout the document, skipping Table of Contents (TOC) paragraphs.

    Replacement is done run by run, so a match that is split across
    several runs of a paragraph is not replaced.

    Args:
        doc: Document object
        old_text: Text to find
        new_text: Text to replace with

    Returns:
        Number of occurrences replaced
    """
    count = 0
    # Search in paragraphs
    for para in doc.paragraphs:
        # Skip TOC paragraphs
        if para.style and para.style.name.startswith("TOC"):
            continue
        if old_text in para.text:
            for run in para.runs:
                if old_text in run.text:
                    # Count every occurrence in the run, not just the run itself
                    count += run.text.count(old_text)
                    run.text = run.text.replace(old_text, new_text)
    # Search in tables
    for table in doc.tables:
        for row in table.rows:
            for cell in row.cells:
                for para in cell.paragraphs:
                    # Skip TOC paragraphs in tables
                    if para.style and para.style.name.startswith("TOC"):
                        continue
                    if old_text in para.text:
                        for run in para.runs:
                            if old_text in run.text:
                                count += run.text.count(old_text)
                                run.text = run.text.replace(old_text, new_text)
    return count
def get_document_xml(doc_path: str) -> str:
"""Extract and return the raw XML structure of the Word document (word/document.xml)."""
import os
import zipfile
if not os.path.exists(doc_path):
return f"Document {doc_path} does not exist"
try:
with zipfile.ZipFile(doc_path) as docx_zip:
with docx_zip.open('word/document.xml') as xml_file:
return xml_file.read().decode('utf-8')
except Exception as e:
return f"Failed to extract XML: {str(e)}"
def insert_header_near_text(doc_path: str, target_text: str = None, header_title: str = "", position: str = 'after', header_style: str = 'Heading 1', target_paragraph_index: int = None) -> str:
"""Insert a header (with specified style) before or after the target paragraph. Specify by text or paragraph index. Skips TOC paragraphs in text search."""
import os
from docx import Document
if not os.path.exists(doc_path):
return f"Document {doc_path} does not exist"
try:
doc = Document(doc_path)
found = False
para = None
if target_paragraph_index is not None:
if target_paragraph_index < 0 or target_paragraph_index >= len(doc.paragraphs):
return f"Invalid target_paragraph_index: {target_paragraph_index}. Document has {len(doc.paragraphs)} paragraphs."
para = doc.paragraphs[target_paragraph_index]
found = True
else:
for i, p in enumerate(doc.paragraphs):
# Skip TOC paragraphs
if p.style and p.style.name.lower().startswith("toc"):
continue
if target_text and target_text in p.text:
para = p
found = True
break
if not found or para is None:
return f"Target paragraph not found (by index or text). (TOC paragraphs are skipped in text search)"
# Save anchor index before insertion
if target_paragraph_index is not None:
anchor_index = target_paragraph_index
else:
anchor_index = None
for i, p in enumerate(doc.paragraphs):
if p is para:
anchor_index = i
break
new_para = doc.add_paragraph(header_title, style=header_style)
if position == 'before':
para._element.addprevious(new_para._element)
else:
para._element.addnext(new_para._element)
doc.save(doc_path)
if anchor_index is not None:
return f"Header '{header_title}' (style: {header_style}) inserted {position} paragraph (index {anchor_index})."
else:
return f"Header '{header_title}' (style: {header_style}) inserted {position} the target paragraph."
except Exception as e:
return f"Failed to insert header: {str(e)}"
def insert_line_or_paragraph_near_text(doc_path: str, target_text: str = None, line_text: str = "", position: str = 'after', line_style: str = None, target_paragraph_index: int = None) -> str:
"""
Insert a new line or paragraph (with specified or matched style) before or after the target paragraph.
You can specify the target by text (first match) or by paragraph index.
Skips paragraphs whose style name starts with 'TOC' if using text search.
"""
import os
from docx import Document
if not os.path.exists(doc_path):
return f"Document {doc_path} does not exist"
try:
doc = Document(doc_path)
found = False
para = None
if target_paragraph_index is not None:
if target_paragraph_index < 0 or target_paragraph_index >= len(doc.paragraphs):
return f"Invalid target_paragraph_index: {target_paragraph_index}. Document has {len(doc.paragraphs)} paragraphs."
para = doc.paragraphs[target_paragraph_index]
found = True
else:
for i, p in enumerate(doc.paragraphs):
# Skip TOC paragraphs
if p.style and p.style.name.lower().startswith("toc"):
continue
if target_text and target_text in p.text:
para = p
found = True
break
if not found or para is None:
return f"Target paragraph not found (by index or text). (TOC paragraphs are skipped in text search)"
# Save anchor index before insertion
if target_paragraph_index is not None:
anchor_index = target_paragraph_index
else:
anchor_index = None
for i, p in enumerate(doc.paragraphs):
if p is para:
anchor_index = i
break
        # Determine style: use provided or match target
        style = line_style if line_style else para.style
        # Use the style's name in status messages (para.style is a style object, not a string)
        style_label = style if isinstance(style, str) else style.name
        new_para = doc.add_paragraph(line_text, style=style)
        if position == 'before':
            para._element.addprevious(new_para._element)
        else:
            para._element.addnext(new_para._element)
        doc.save(doc_path)
        if anchor_index is not None:
            return f"Line/paragraph inserted {position} paragraph (index {anchor_index}) with style '{style_label}'."
        else:
            return f"Line/paragraph inserted {position} the target paragraph with style '{style_label}'."
except Exception as e:
return f"Failed to insert line/paragraph: {str(e)}"
def add_bullet_numbering(paragraph, num_id=1, level=0):
"""
Add bullet/numbering XML to a paragraph.
Args:
paragraph: python-docx Paragraph object
num_id: Numbering definition ID (1=bullets, 2=numbers, etc.)
level: Indentation level (0=first level, 1=second level, etc.)
Returns:
The modified paragraph
"""
# Get or create paragraph properties
pPr = paragraph._element.get_or_add_pPr()
# Remove existing numPr if any (to avoid duplicates)
existing_numPr = pPr.find(qn('w:numPr'))
if existing_numPr is not None:
pPr.remove(existing_numPr)
# Create numbering properties element
numPr = OxmlElement('w:numPr')
# Set indentation level
ilvl = OxmlElement('w:ilvl')
ilvl.set(qn('w:val'), str(level))
numPr.append(ilvl)
# Set numbering definition ID
numId = OxmlElement('w:numId')
numId.set(qn('w:val'), str(num_id))
numPr.append(numId)
# Add to paragraph properties
pPr.append(numPr)
return paragraph
def insert_numbered_list_near_text(doc_path: str, target_text: str = None, list_items: list = None, position: str = 'after', target_paragraph_index: int = None, bullet_type: str = 'bullet') -> str:
"""
Insert a bulleted or numbered list before or after the target paragraph. Specify by text or paragraph index. Skips TOC paragraphs in text search.
Args:
doc_path: Path to the Word document
target_text: Text to search for in paragraphs (optional if using index)
list_items: List of strings, each as a list item
position: 'before' or 'after' (default: 'after')
target_paragraph_index: Optional paragraph index to use as anchor
bullet_type: 'bullet' for bullets (•), 'number' for numbers (1,2,3) (default: 'bullet')
Returns:
Status message
"""
import os
from docx import Document
if not os.path.exists(doc_path):
return f"Document {doc_path} does not exist"
try:
doc = Document(doc_path)
found = False
para = None
if target_paragraph_index is not None:
if target_paragraph_index < 0 or target_paragraph_index >= len(doc.paragraphs):
return f"Invalid target_paragraph_index: {target_paragraph_index}. Document has {len(doc.paragraphs)} paragraphs."
para = doc.paragraphs[target_paragraph_index]
found = True
else:
for i, p in enumerate(doc.paragraphs):
# Skip TOC paragraphs
if p.style and p.style.name.lower().startswith("toc"):
continue
if target_text and target_text in p.text:
para = p
found = True
break
if not found or para is None:
return f"Target paragraph not found (by index or text). (TOC paragraphs are skipped in text search)"
# Save anchor index before insertion
if target_paragraph_index is not None:
anchor_index = target_paragraph_index
else:
anchor_index = None
for i, p in enumerate(doc.paragraphs):
if p is para:
anchor_index = i
break
# Determine numbering ID based on bullet_type
num_id = 1 if bullet_type == 'bullet' else 2
# Use ListParagraph style for proper list formatting
style_name = None
for candidate in ['List Paragraph', 'ListParagraph', 'Normal']:
try:
_ = doc.styles[candidate]
style_name = candidate
break
except KeyError:
continue
# If no candidate style exists, style_name stays None and python-docx applies the default style
new_paras = []
for item in (list_items or []):
p = doc.add_paragraph(item, style=style_name)
# Add bullet numbering XML - this is the fix!
add_bullet_numbering(p, num_id=num_id, level=0)
new_paras.append(p)
# Move the new paragraphs to the correct position
for p in reversed(new_paras):
if position == 'before':
para._element.addprevious(p._element)
else:
para._element.addnext(p._element)
doc.save(doc_path)
list_type = "bulleted" if bullet_type == 'bullet' else "numbered"
if anchor_index is not None:
return f"{list_type.capitalize()} list with {len(new_paras)} items inserted {position} paragraph (index {anchor_index})."
else:
return f"{list_type.capitalize()} list with {len(new_paras)} items inserted {position} the target paragraph."
except Exception as e:
return f"Failed to insert list: {str(e)}"
def is_toc_paragraph(para):
    """Return True if the paragraph has a table-of-contents (TOC) style."""
    return bool(para.style and para.style.name.upper().startswith("TOC"))
def is_heading_paragraph(para):
    """Return True if the paragraph has a heading style (Heading 1, Heading 2, etc.)."""
    return bool(para.style and para.style.name.lower().startswith("heading"))
# --- Helper: Get style name from a <w:p> element ---
def get_paragraph_style(el):
    from docx.oxml.ns import qn
    pPr = el.find(qn('w:pPr'))
    if pPr is not None:
        pStyle = pPr.find(qn('w:pStyle'))
        if pStyle is not None:
            # Attribute keys are stored namespace-qualified, so the literal
            # 'w:val' never matches; look the attribute up via qn() instead.
            return pStyle.get(qn('w:val'))
    return None
# --- Main: Delete the paragraphs under a header until next heading/TOC ---
def delete_block_under_header(doc, header_text):
    """
    Remove the paragraphs after the header (matched by text) and before the next heading/TOC (matched by style).
    Only doc.paragraphs is traversed, so tables and other non-paragraph elements inside the block are left in place.
    Returns: (header_element, paragraphs_removed)
    """
    # Find the header paragraph by text (case-insensitive, surrounding whitespace ignored)
    header_para = None
    header_idx = None
    for i, para in enumerate(doc.paragraphs):
        if para.text.strip().lower() == header_text.strip().lower():
            header_para = para
            header_idx = i
            break
    if header_para is None:
        return None, 0
    # Find the next heading/TOC paragraph to determine the end of the block
    end_idx = None
    for i in range(header_idx + 1, len(doc.paragraphs)):
        para = doc.paragraphs[i]
        if para.style and para.style.name.lower().startswith(('heading', 'título', 'toc')):
            end_idx = i
            break
    # If no next heading found, delete until end of document
    if end_idx is None:
        end_idx = len(doc.paragraphs)
    # Remove paragraphs one at a time; after each removal the next paragraph
    # of the block shifts into position header_idx + 1
    removed_count = 0
    for i in range(header_idx + 1, end_idx):
        if i < len(doc.paragraphs):  # Safety check
            para = doc.paragraphs[header_idx + 1]  # Always remove the first paragraph after header
            p = para._p
            p.getparent().remove(p)
            removed_count += 1
    return header_para._p, removed_count
# --- Usage in replace_paragraph_block_below_header ---
def replace_paragraph_block_below_header(
doc_path: str,
header_text: str,
new_paragraphs: list,
detect_block_end_fn=None,
new_paragraph_style: str = None
) -> str:
"""
Replace all content below a header (matched by text), up to the next heading/TOC (matched by style).
"""
from docx import Document
import os
if not os.path.exists(doc_path):
return f"Document {doc_path} not found."
doc = Document(doc_path)
# Find the header paragraph first
header_para = None
header_idx = None
for i, para in enumerate(doc.paragraphs):
para_text = para.text.strip().lower()
is_toc = is_toc_paragraph(para)
if para_text == header_text.strip().lower() and not is_toc:
header_para = para
header_idx = i
break
if header_para is None:
return f"Header '{header_text}' not found in document."
# Delete everything under the header using the same document instance
header_el, removed_count = delete_block_under_header(doc, header_text)
# Now insert new paragraphs after the header (which should still be in the document)
style_to_use = new_paragraph_style or "Normal"
# Find the header again after deletion (it should still be there)
current_para = header_para
for text in new_paragraphs:
new_para = doc.add_paragraph(text, style=style_to_use)
current_para._element.addnext(new_para._element)
current_para = new_para
doc.save(doc_path)
return f"Replaced content under '{header_text}' with {len(new_paragraphs)} paragraph(s), style: {style_to_use}, removed {removed_count} elements."
def replace_block_between_manual_anchors(
doc_path: str,
start_anchor_text: str,
new_paragraphs: list,
end_anchor_text: str = None,
match_fn=None,
new_paragraph_style: str = None
) -> str:
"""
Replace all content (paragraphs, tables, etc.) between start_anchor_text and end_anchor_text (or next logical header if not provided).
If end_anchor_text is None, deletes until next visually distinct paragraph (bold, all caps, or different font size), or end of document.
Inserts new_paragraphs after the start anchor.
"""
from docx import Document
import os
if not os.path.exists(doc_path):
return f"Document {doc_path} not found."
doc = Document(doc_path)
body = doc.element.body
elements = list(body)
start_idx = None
end_idx = None
# Find start anchor
for i, el in enumerate(elements):
if el.tag.endswith('}p'):  # body-level paragraph element (<w:p>)
p_text = "".join([node.text or '' for node in el.iter() if node.tag.endswith('}t')]).strip()
if match_fn:
if match_fn(p_text, el):
start_idx = i
break
elif p_text == start_anchor_text.strip():
start_idx = i
break
if start_idx is None:
return f"Start anchor '{start_anchor_text}' not found."
# Find end anchor
if end_anchor_text:
for i in range(start_idx + 1, len(elements)):
el = elements[i]
if el.tag.endswith('}p'):  # body-level paragraph element (<w:p>)
p_text = "".join([node.text or '' for node in el.iter() if node.tag.endswith('}t')]).strip()
if match_fn:
if match_fn(p_text, el, is_end=True):
end_idx = i
break
elif p_text == end_anchor_text.strip():
end_idx = i
break
else:
# Heuristic: next visually distinct paragraph (bold, all caps, or different font size), or end of document
for i in range(start_idx + 1, len(elements)):
el = elements[i]
if el.tag.endswith('}p'):  # body-level paragraph element (<w:p>)
# Check for bold, all caps, or font size
runs = [node for node in el.iter() if node.tag.endswith('}r')]
for run in runs:
rpr = run.find(qn('w:rPr'))
if rpr is not None:
if rpr.find(qn('w:b')) is not None or rpr.find(qn('w:caps')) is not None or rpr.find(qn('w:sz')) is not None:
end_idx = i
break
if end_idx is not None:
break
# Mark elements for removal
to_remove = []
for i in range(start_idx + 1, end_idx if end_idx is not None else len(elements)):
to_remove.append(elements[i])
for el in to_remove:
body.remove(el)
doc.save(doc_path)
# Reload and find start anchor for insertion
doc = Document(doc_path)
paras = doc.paragraphs
anchor_idx = None
for i, para in enumerate(paras):
if para.text.strip() == start_anchor_text.strip():
anchor_idx = i
break
if anchor_idx is None:
return f"Start anchor '{start_anchor_text}' not found after deletion (unexpected)."
anchor_para = paras[anchor_idx]
style_to_use = new_paragraph_style or "Normal"
for text in new_paragraphs:
new_para = doc.add_paragraph(text, style=style_to_use)
anchor_para._element.addnext(new_para._element)
anchor_para = new_para
doc.save(doc_path)
return f"Replaced content between '{start_anchor_text}' and '{end_anchor_text or 'next logical header'}' with {len(new_paragraphs)} paragraph(s), style: {style_to_use}, removed {len(to_remove)} elements."
```
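
The style lookup in `get_paragraph_style` depends on how namespaced XML attributes are keyed. The following is a minimal sketch using only the standard library's `xml.etree.ElementTree` rather than python-docx; the `qn` function here is a simplified stand-in for `docx.oxml.ns.qn`, and `paragraph_style_id` and `SAMPLE` are hypothetical names introduced for illustration. It shows that a parsed `w:val` attribute is stored under its namespace-qualified key, so a membership test against the literal string `'w:val'` does not match.

```python
import xml.etree.ElementTree as ET

# WordprocessingML main namespace, bound to the "w" prefix in .docx files.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def qn(tag: str) -> str:
    """Simplified stand-in for docx.oxml.ns.qn; handles only the 'w' prefix."""
    prefix, local = tag.split(":")
    if prefix != "w":
        raise ValueError(f"unsupported prefix: {prefix}")
    return f"{{{W_NS}}}{local}"


def paragraph_style_id(p_el):
    """Return the w:pStyle value of a <w:p> element, or None if absent."""
    p_pr = p_el.find(qn("w:pPr"))
    if p_pr is None:
        return None
    p_style = p_pr.find(qn("w:pStyle"))
    if p_style is None:
        return None
    # Attributes are keyed by their qualified name, not by "w:val".
    return p_style.get(qn("w:val"))


SAMPLE = f'<w:p xmlns:w="{W_NS}"><w:pPr><w:pStyle w:val="Heading1"/></w:pPr></w:p>'
p = ET.fromstring(SAMPLE)
p_style = p.find(qn("w:pPr")).find(qn("w:pStyle"))

print("w:val" in p_style.attrib)   # False: the key is namespace-qualified
print(paragraph_style_id(p))       # Heading1
```

Note that the value read this way is the style ID (for example `Heading1`), which can differ from the display name (`Heading 1`) returned by `paragraph.style.name` in python-docx.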
--------------------------------------------------------------------------------
/word_document_server/main.py:
--------------------------------------------------------------------------------
```python
"""
Main entry point for the Word Document MCP Server.
Acts as the central controller for the MCP server that handles Word document operations.
Supports multiple transports: stdio, sse, and streamable-http using standalone FastMCP.
"""
import os
import sys
from dotenv import load_dotenv
# Load environment variables from .env file
print("Loading configuration from .env file...")
load_dotenv()
# Set required environment variable for FastMCP 2.8.1+
os.environ.setdefault('FASTMCP_LOG_LEVEL', 'INFO')
from fastmcp import FastMCP
from word_document_server.tools import (
document_tools,
content_tools,
format_tools,
protection_tools,
footnote_tools,
extended_document_tools,
comment_tools
)
from word_document_server.tools.content_tools import replace_paragraph_block_below_header_tool
from word_document_server.tools.content_tools import replace_block_between_manual_anchors_tool
def get_transport_config():
"""
Get transport configuration from environment variables.
Returns:
dict: Transport configuration with type, host, port, and other settings
"""
# Default configuration
config = {
'transport': 'stdio', # Default to stdio for backward compatibility
'host': '0.0.0.0',
'port': 8000,
'path': '/mcp',
'sse_path': '/sse'
}
# Override with environment variables if provided
transport = os.getenv('MCP_TRANSPORT', 'stdio').lower()
print(f"Transport: {transport}")
# Validate transport type
valid_transports = ['stdio', 'streamable-http', 'sse']
if transport not in valid_transports:
print(f"Warning: Invalid transport '{transport}'. Falling back to 'stdio'.")
transport = 'stdio'
config['transport'] = transport
config['host'] = os.getenv('MCP_HOST', config['host'])
# Use PORT from Render if available, otherwise fall back to MCP_PORT or default
config['port'] = int(os.getenv('PORT', os.getenv('MCP_PORT', config['port'])))
config['path'] = os.getenv('MCP_PATH', config['path'])
config['sse_path'] = os.getenv('MCP_SSE_PATH', config['sse_path'])
return config
def setup_logging(debug_mode):
"""
Setup logging based on debug mode.
Args:
debug_mode (bool): Whether to enable debug logging
"""
import logging
if debug_mode:
logging.basicConfig(
level=logging.DEBUG,
format='%(asctime)s - %(name)s - %(levelname)s - %(message)s'
)
print("Debug logging enabled")
else:
logging.basicConfig(
level=logging.INFO,
format='%(asctime)s - %(levelname)s - %(message)s'
)
# Initialize FastMCP server
mcp = FastMCP("Word Document Server")
def register_tools():
"""Register all tools with the MCP server using FastMCP decorators."""
# Document tools (create, copy, info, etc.)
@mcp.tool()
def create_document(filename: str, title: str = None, author: str = None):
"""Create a new Word document with optional metadata."""
return document_tools.create_document(filename, title, author)
@mcp.tool()
def copy_document(source_filename: str, destination_filename: str = None):
"""Create a copy of a Word document."""
return document_tools.copy_document(source_filename, destination_filename)
@mcp.tool()
def get_document_info(filename: str):
"""Get information about a Word document."""
return document_tools.get_document_info(filename)
@mcp.tool()
def get_document_text(filename: str):
"""Extract all text from a Word document."""
return document_tools.get_document_text(filename)
@mcp.tool()
def get_document_outline(filename: str):
"""Get the structure of a Word document."""
return document_tools.get_document_outline(filename)
@mcp.tool()
def list_available_documents(directory: str = "."):
"""List all .docx files in the specified directory."""
return document_tools.list_available_documents(directory)
@mcp.tool()
def get_document_xml(filename: str):
"""Get the raw XML structure of a Word document."""
return document_tools.get_document_xml_tool(filename)
@mcp.tool()
def insert_header_near_text(filename: str, target_text: str = None, header_title: str = None, position: str = 'after', header_style: str = 'Heading 1', target_paragraph_index: int = None):
"""Insert a header (with specified style) before or after the target paragraph. Specify by text or paragraph index. Args: filename (str), target_text (str, optional), header_title (str), position ('before' or 'after'), header_style (str, default 'Heading 1'), target_paragraph_index (int, optional)."""
return content_tools.insert_header_near_text_tool(filename, target_text, header_title, position, header_style, target_paragraph_index)
@mcp.tool()
def insert_line_or_paragraph_near_text(filename: str, target_text: str = None, line_text: str = None, position: str = 'after', line_style: str = None, target_paragraph_index: int = None):
"""
Insert a new line or paragraph (with specified or matched style) before or after the target paragraph. Specify by text or paragraph index. Args: filename (str), target_text (str, optional), line_text (str), position ('before' or 'after'), line_style (str, optional), target_paragraph_index (int, optional).
"""
return content_tools.insert_line_or_paragraph_near_text_tool(filename, target_text, line_text, position, line_style, target_paragraph_index)
@mcp.tool()
def insert_numbered_list_near_text(filename: str, target_text: str = None, list_items: list = None, position: str = 'after', target_paragraph_index: int = None, bullet_type: str = 'bullet'):
"""Insert a bulleted or numbered list before or after the target paragraph. Specify by text or paragraph index. Args: filename (str), target_text (str, optional), list_items (list of str), position ('before' or 'after'), target_paragraph_index (int, optional), bullet_type ('bullet' for bullets or 'number' for numbered lists, default: 'bullet')."""
return content_tools.insert_numbered_list_near_text_tool(filename, target_text, list_items, position, target_paragraph_index, bullet_type)
# Content tools (paragraphs, headings, tables, etc.)
@mcp.tool()
def add_paragraph(filename: str, text: str, style: str = None,
font_name: str = None, font_size: int = None,
bold: bool = None, italic: bool = None, color: str = None):
"""Add a paragraph to a Word document with optional formatting.
Args:
filename: Path to Word document
text: Paragraph text content
style: Optional paragraph style name
font_name: Font family (e.g., 'Helvetica', 'Times New Roman')
font_size: Font size in points (e.g., 14, 36)
bold: Make text bold
italic: Make text italic
color: Text color as hex RGB (e.g., '000000')
"""
return content_tools.add_paragraph(filename, text, style, font_name, font_size, bold, italic, color)
@mcp.tool()
def add_heading(filename: str, text: str, level: int = 1,
font_name: str = None, font_size: int = None,
bold: bool = None, italic: bool = None, border_bottom: bool = False):
"""Add a heading to a Word document with optional formatting.
Args:
filename: Path to Word document
text: Heading text
level: Heading level (1-9)
font_name: Font family (e.g., 'Helvetica')
font_size: Font size in points (e.g., 14)
bold: Make heading bold
italic: Make heading italic
border_bottom: Add bottom border (for section headers)
"""
return content_tools.add_heading(filename, text, level, font_name, font_size, bold, italic, border_bottom)
@mcp.tool()
def add_picture(filename: str, image_path: str, width: float = None):
"""Add an image to a Word document."""
return content_tools.add_picture(filename, image_path, width)
@mcp.tool()
def add_table(filename: str, rows: int, cols: int, data: list = None):
"""Add a table to a Word document."""
return content_tools.add_table(filename, rows, cols, data)
@mcp.tool()
def add_page_break(filename: str):
"""Add a page break to the document."""
return content_tools.add_page_break(filename)
@mcp.tool()
def delete_paragraph(filename: str, paragraph_index: int):
"""Delete a paragraph from a document."""
return content_tools.delete_paragraph(filename, paragraph_index)
@mcp.tool()
def search_and_replace(filename: str, find_text: str, replace_text: str):
"""Search for text and replace all occurrences."""
return content_tools.search_and_replace(filename, find_text, replace_text)
# Format tools (styling, text formatting, etc.)
@mcp.tool()
def create_custom_style(filename: str, style_name: str, bold: bool = None,
italic: bool = None, font_size: int = None,
font_name: str = None, color: str = None,
base_style: str = None):
"""Create a custom style in the document."""
return format_tools.create_custom_style(
filename, style_name, bold, italic, font_size, font_name, color, base_style
)
@mcp.tool()
def format_text(filename: str, paragraph_index: int, start_pos: int, end_pos: int,
bold: bool = None, italic: bool = None, underline: bool = None,
color: str = None, font_size: int = None, font_name: str = None):
"""Format a specific range of text within a paragraph."""
return format_tools.format_text(
filename, paragraph_index, start_pos, end_pos, bold, italic,
underline, color, font_size, font_name
)
@mcp.tool()
def format_table(filename: str, table_index: int, has_header_row: bool = None,
border_style: str = None, shading: list = None):
"""Format a table with borders, shading, and structure."""
return format_tools.format_table(filename, table_index, has_header_row, border_style, shading)
# New table cell shading tools
@mcp.tool()
def set_table_cell_shading(filename: str, table_index: int, row_index: int,
col_index: int, fill_color: str, pattern: str = "clear"):
"""Apply shading/filling to a specific table cell."""
return format_tools.set_table_cell_shading(filename, table_index, row_index, col_index, fill_color, pattern)
@mcp.tool()
def apply_table_alternating_rows(filename: str, table_index: int,
color1: str = "FFFFFF", color2: str = "F2F2F2"):
"""Apply alternating row colors to a table for better readability."""
return format_tools.apply_table_alternating_rows(filename, table_index, color1, color2)
@mcp.tool()
def highlight_table_header(filename: str, table_index: int,
header_color: str = "4472C4", text_color: str = "FFFFFF"):
"""Apply special highlighting to table header row."""
return format_tools.highlight_table_header(filename, table_index, header_color, text_color)
# Cell merging tools
@mcp.tool()
def merge_table_cells(filename: str, table_index: int, start_row: int, start_col: int,
end_row: int, end_col: int):
"""Merge cells in a rectangular area of a table."""
return format_tools.merge_table_cells(filename, table_index, start_row, start_col, end_row, end_col)
@mcp.tool()
def merge_table_cells_horizontal(filename: str, table_index: int, row_index: int,
start_col: int, end_col: int):
"""Merge cells horizontally in a single row."""
return format_tools.merge_table_cells_horizontal(filename, table_index, row_index, start_col, end_col)
@mcp.tool()
def merge_table_cells_vertical(filename: str, table_index: int, col_index: int,
start_row: int, end_row: int):
"""Merge cells vertically in a single column."""
return format_tools.merge_table_cells_vertical(filename, table_index, col_index, start_row, end_row)
# Cell alignment tools
@mcp.tool()
def set_table_cell_alignment(filename: str, table_index: int, row_index: int, col_index: int,
horizontal: str = "left", vertical: str = "top"):
"""Set text alignment for a specific table cell."""
return format_tools.set_table_cell_alignment(filename, table_index, row_index, col_index, horizontal, vertical)
@mcp.tool()
def set_table_alignment_all(filename: str, table_index: int,
horizontal: str = "left", vertical: str = "top"):
"""Set text alignment for all cells in a table."""
return format_tools.set_table_alignment_all(filename, table_index, horizontal, vertical)
# Protection tools
@mcp.tool()
def protect_document(filename: str, password: str):
"""Add password protection to a Word document."""
return protection_tools.protect_document(filename, password)
@mcp.tool()
def unprotect_document(filename: str, password: str):
"""Remove password protection from a Word document."""
return protection_tools.unprotect_document(filename, password)
# Footnote tools
@mcp.tool()
def add_footnote_to_document(filename: str, paragraph_index: int, footnote_text: str):
"""Add a footnote to a specific paragraph in a Word document."""
return footnote_tools.add_footnote_to_document(filename, paragraph_index, footnote_text)
@mcp.tool()
def add_footnote_after_text(filename: str, search_text: str, footnote_text: str,
output_filename: str = None):
"""Add a footnote after specific text with proper superscript formatting.
This enhanced function ensures footnotes display correctly as superscript."""
return footnote_tools.add_footnote_after_text(filename, search_text, footnote_text, output_filename)
@mcp.tool()
def add_footnote_before_text(filename: str, search_text: str, footnote_text: str,
output_filename: str = None):
"""Add a footnote before specific text with proper superscript formatting.
This enhanced function ensures footnotes display correctly as superscript."""
return footnote_tools.add_footnote_before_text(filename, search_text, footnote_text, output_filename)
@mcp.tool()
def add_footnote_enhanced(filename: str, paragraph_index: int, footnote_text: str,
output_filename: str = None):
"""Enhanced footnote addition with guaranteed superscript formatting.
Adds footnote at the end of a specific paragraph with proper style handling."""
return footnote_tools.add_footnote_enhanced(filename, paragraph_index, footnote_text, output_filename)
@mcp.tool()
def add_endnote_to_document(filename: str, paragraph_index: int, endnote_text: str):
"""Add an endnote to a specific paragraph in a Word document."""
return footnote_tools.add_endnote_to_document(filename, paragraph_index, endnote_text)
@mcp.tool()
def customize_footnote_style(filename: str, numbering_format: str = "1, 2, 3",
start_number: int = 1, font_name: str = None,
font_size: int = None):
"""Customize footnote numbering and formatting in a Word document."""
return footnote_tools.customize_footnote_style(
filename, numbering_format, start_number, font_name, font_size
)
@mcp.tool()
def delete_footnote_from_document(filename: str, footnote_id: int = None,
search_text: str = None, output_filename: str = None):
"""Delete a footnote from a Word document.
Identify the footnote either by ID (1, 2, 3, etc.) or by searching for text near it."""
return footnote_tools.delete_footnote_from_document(
filename, footnote_id, search_text, output_filename
)
# Robust footnote tools - Production-ready with comprehensive validation
@mcp.tool()
def add_footnote_robust(filename: str, search_text: str = None,
paragraph_index: int = None, footnote_text: str = "",
validate_location: bool = True, auto_repair: bool = False):
"""Add footnote with robust validation and Word compliance.
This is the production-ready version with comprehensive error handling."""
return footnote_tools.add_footnote_robust_tool(
filename, search_text, paragraph_index, footnote_text,
validate_location, auto_repair
)
@mcp.tool()
def validate_document_footnotes(filename: str):
"""Validate all footnotes in document for coherence and compliance.
Returns detailed report on ID conflicts, orphaned content, missing styles, etc."""
return footnote_tools.validate_footnotes_tool(filename)
@mcp.tool()
def delete_footnote_robust(filename: str, footnote_id: int = None,
search_text: str = None, clean_orphans: bool = True):
"""Delete footnote with comprehensive cleanup and orphan removal.
Ensures complete removal from document.xml, footnotes.xml, and relationships."""
return footnote_tools.delete_footnote_robust_tool(
filename, footnote_id, search_text, clean_orphans
)
# Extended document tools
@mcp.tool()
def get_paragraph_text_from_document(filename: str, paragraph_index: int):
"""Get text from a specific paragraph in a Word document."""
return extended_document_tools.get_paragraph_text_from_document(filename, paragraph_index)
@mcp.tool()
def find_text_in_document(filename: str, text_to_find: str, match_case: bool = True,
whole_word: bool = False):
"""Find occurrences of specific text in a Word document."""
return extended_document_tools.find_text_in_document(
filename, text_to_find, match_case, whole_word
)
@mcp.tool()
def convert_to_pdf(filename: str, output_filename: str = None):
"""Convert a Word document to PDF format."""
return extended_document_tools.convert_to_pdf(filename, output_filename)
@mcp.tool()
def replace_paragraph_block_below_header(filename: str, header_text: str, new_paragraphs: list, detect_block_end_fn=None):
"""Replace the block of paragraphs below a heading, without modifying the TOC."""
return replace_paragraph_block_below_header_tool(filename, header_text, new_paragraphs, detect_block_end_fn)
@mcp.tool()
def replace_block_between_manual_anchors(filename: str, start_anchor_text: str, new_paragraphs: list, end_anchor_text: str = None, match_fn=None, new_paragraph_style: str = None):
"""Replace all content between start_anchor_text and end_anchor_text (or next logical header if not provided)."""
return replace_block_between_manual_anchors_tool(filename, start_anchor_text, new_paragraphs, end_anchor_text, match_fn, new_paragraph_style)
# Comment tools
@mcp.tool()
def get_all_comments(filename: str):
"""Extract all comments from a Word document."""
return comment_tools.get_all_comments(filename)
@mcp.tool()
def get_comments_by_author(filename: str, author: str):
"""Extract comments from a specific author in a Word document."""
return comment_tools.get_comments_by_author(filename, author)
@mcp.tool()
def get_comments_for_paragraph(filename: str, paragraph_index: int):
"""Extract comments for a specific paragraph in a Word document."""
return comment_tools.get_comments_for_paragraph(filename, paragraph_index)
# New table column width tools
@mcp.tool()
def set_table_column_width(filename: str, table_index: int, col_index: int,
width: float, width_type: str = "points"):
"""Set the width of a specific table column."""
return format_tools.set_table_column_width(filename, table_index, col_index, width, width_type)
@mcp.tool()
def set_table_column_widths(filename: str, table_index: int, widths: list,
width_type: str = "points"):
"""Set the widths of multiple table columns."""
return format_tools.set_table_column_widths(filename, table_index, widths, width_type)
@mcp.tool()
def set_table_width(filename: str, table_index: int, width: float,
width_type: str = "points"):
"""Set the overall width of a table."""
return format_tools.set_table_width(filename, table_index, width, width_type)
@mcp.tool()
def auto_fit_table_columns(filename: str, table_index: int):
"""Set table columns to auto-fit based on content."""
return format_tools.auto_fit_table_columns(filename, table_index)
# New table cell text formatting and padding tools
@mcp.tool()
def format_table_cell_text(filename: str, table_index: int, row_index: int, col_index: int,
text_content: str = None, bold: bool = None, italic: bool = None,
underline: bool = None, color: str = None, font_size: int = None,
font_name: str = None):
"""Format text within a specific table cell."""
return format_tools.format_table_cell_text(filename, table_index, row_index, col_index,
text_content, bold, italic, underline, color, font_size, font_name)
@mcp.tool()
def set_table_cell_padding(filename: str, table_index: int, row_index: int, col_index: int,
top: float = None, bottom: float = None, left: float = None,
right: float = None, unit: str = "points"):
"""Set padding/margins for a specific table cell."""
return format_tools.set_table_cell_padding(filename, table_index, row_index, col_index,
top, bottom, left, right, unit)
def run_server():
"""Run the Word Document MCP Server with configurable transport."""
# Get transport configuration
config = get_transport_config()
# Setup logging
# setup_logging(config['debug'])
# Register all tools
register_tools()
# Print startup information
transport_type = config['transport']
print(f"Starting Word Document MCP Server with {transport_type} transport...")
# if config['debug']:
# print(f"Configuration: {config}")
try:
if transport_type == 'stdio':
# Run with stdio transport (default, backward compatible)
print("Server running on stdio transport")
mcp.run(transport='stdio')
elif transport_type == 'streamable-http':
# Run with streamable HTTP transport
print(f"Server running on streamable-http transport at http://{config['host']}:{config['port']}{config['path']}")
mcp.run(
transport='streamable-http',
host=config['host'],
port=config['port'],
path=config['path']
)
elif transport_type == 'sse':
# Run with SSE transport
print(f"Server running on SSE transport at http://{config['host']}:{config['port']}{config['sse_path']}")
mcp.run(
transport='sse',
host=config['host'],
port=config['port'],
path=config['sse_path']
)
except KeyboardInterrupt:
print("\nShutting down server...")
except Exception as e:
print(f"Error starting server: {e}")
if config.get('debug'):  # get_transport_config() sets no 'debug' key; avoid a KeyError here
import traceback
traceback.print_exc()
sys.exit(1)
return mcp
def main():
"""Main entry point for the server."""
run_server()
if __name__ == "__main__":
main()
```
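
The precedence rules in `get_transport_config` (`PORT` over `MCP_PORT` over the default, and unknown transports falling back to `stdio`) can be sketched as a pure function that takes the environment as a mapping, which makes them easy to check without starting a server. This is an illustration under stated assumptions, not the server's code: `resolve_transport_config`, the `debug` entry, and the `MCP_DEBUG` variable are hypothetical additions, since the config dict in the source does not define a debug entry.

```python
from typing import Mapping

VALID_TRANSPORTS = ("stdio", "streamable-http", "sse")


def resolve_transport_config(env: Mapping[str, str]) -> dict:
    """Build a transport config from an environment-like mapping."""
    transport = env.get("MCP_TRANSPORT", "stdio").lower()
    if transport not in VALID_TRANSPORTS:
        # Unknown values fall back to stdio, as in the original function.
        transport = "stdio"
    # PORT (set by hosting platforms such as Render) wins over MCP_PORT.
    port = int(env.get("PORT", env.get("MCP_PORT", "8000")))
    return {
        "transport": transport,
        "host": env.get("MCP_HOST", "0.0.0.0"),
        "port": port,
        "path": env.get("MCP_PATH", "/mcp"),
        "sse_path": env.get("MCP_SSE_PATH", "/sse"),
        # Hypothetical flag: MCP_DEBUG is not read anywhere in the source.
        "debug": env.get("MCP_DEBUG", "").lower() in ("1", "true", "yes"),
    }


config = resolve_transport_config(
    {"MCP_TRANSPORT": "SSE", "PORT": "10000", "MCP_PORT": "8001"}
)
print(config["transport"], config["port"], config["debug"])  # sse 10000 False
```

Passing `os.environ` as the mapping would reproduce the environment-driven behavior, while tests can pass a plain dict.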
--------------------------------------------------------------------------------
/word_document_server/tools/footnote_tools.py:
--------------------------------------------------------------------------------
```python
"""
Footnote and endnote tools for Word Document Server.
These tools handle footnote and endnote functionality,
including adding, customizing, and converting between them.
This module combines both standard and robust implementations:
- String-return functions for backward compatibility
- Dict-return robust functions for structured responses
"""
import os
from typing import Optional, Dict, Any
from docx import Document
from docx.shared import Pt
from docx.enum.style import WD_STYLE_TYPE
from word_document_server.utils.file_utils import check_file_writeable, ensure_docx_extension
from word_document_server.core.footnotes import (
find_footnote_references,
get_format_symbols,
customize_footnote_formatting,
add_footnote_robust,
delete_footnote_robust,
validate_document_footnotes,
add_footnote_at_paragraph_end # Compatibility function
)
async def add_footnote_to_document(filename: str, paragraph_index: int, footnote_text: str) -> str:
"""Add a footnote to a specific paragraph in a Word document.
Args:
filename: Path to the Word document
paragraph_index: Index of the paragraph to add footnote to (0-based)
footnote_text: Text content of the footnote
"""
filename = ensure_docx_extension(filename)
# Ensure paragraph_index is an integer
try:
paragraph_index = int(paragraph_index)
except (ValueError, TypeError):
return "Invalid parameter: paragraph_index must be an integer"
if not os.path.exists(filename):
return f"Document {filename} does not exist"
# Check if file is writeable
is_writeable, error_message = check_file_writeable(filename)
if not is_writeable:
return f"Cannot modify document: {error_message}. Consider creating a copy first."
try:
doc = Document(filename)
# Validate paragraph index
if paragraph_index < 0 or paragraph_index >= len(doc.paragraphs):
return f"Invalid paragraph index. Document has {len(doc.paragraphs)} paragraphs (0-{len(doc.paragraphs)-1})."
paragraph = doc.paragraphs[paragraph_index]
# Try a native add_footnote() call on the run first; python-docx builds that lack it raise AttributeError, which triggers the simplified fallback below
try:
footnote = paragraph.add_run()
footnote.text = ""
# Create the footnote reference
reference = footnote.add_footnote(footnote_text)
doc.save(filename)
return f"Footnote added to paragraph {paragraph_index} in {filename}"
except AttributeError:
# Fall back to a simpler approach if direct footnote addition fails
last_run = paragraph.add_run()
last_run.text = "¹" # Unicode superscript 1
last_run.font.superscript = True
# Add a footnote section at the end if it doesn't exist
found_footnote_section = False
for p in doc.paragraphs:
if p.text.startswith("Footnotes:"):
found_footnote_section = True
break
if not found_footnote_section:
doc.add_paragraph("\n").add_run()
# Bold is a run-level property; assigning .bold on the paragraph object has no effect
doc.add_paragraph().add_run("Footnotes:").bold = True
# Add footnote text
footnote_para = doc.add_paragraph("¹ " + footnote_text)
footnote_para.style = "Footnote Text" if "Footnote Text" in doc.styles else "Normal"
doc.save(filename)
return f"Footnote added to paragraph {paragraph_index} in {filename} (simplified approach)"
except Exception as e:
return f"Failed to add footnote: {str(e)}"


async def add_endnote_to_document(filename: str, paragraph_index: int, endnote_text: str) -> str:
    """Add an endnote to a specific paragraph in a Word document.

    Args:
        filename: Path to the Word document
        paragraph_index: Index of the paragraph to add endnote to (0-based)
        endnote_text: Text content of the endnote
    """
    filename = ensure_docx_extension(filename)

    # Ensure paragraph_index is an integer
    try:
        paragraph_index = int(paragraph_index)
    except (ValueError, TypeError):
        return "Invalid parameter: paragraph_index must be an integer"

    if not os.path.exists(filename):
        return f"Document {filename} does not exist"

    # Check if file is writeable
    is_writeable, error_message = check_file_writeable(filename)
    if not is_writeable:
        return f"Cannot modify document: {error_message}. Consider creating a copy first."

    try:
        doc = Document(filename)

        # Validate paragraph index
        if paragraph_index < 0 or paragraph_index >= len(doc.paragraphs):
            return f"Invalid paragraph index. Document has {len(doc.paragraphs)} paragraphs (0-{len(doc.paragraphs)-1})."

        paragraph = doc.paragraphs[paragraph_index]

        # Add endnote reference
        last_run = paragraph.add_run()
        last_run.text = "†"  # Unicode dagger symbol common for endnotes
        last_run.font.superscript = True

        # Check if endnotes section exists, if not create it
        endnotes_heading_found = False
        for para in doc.paragraphs:
            if para.text == "Endnotes:" or para.text == "ENDNOTES":
                endnotes_heading_found = True
                break

        if not endnotes_heading_found:
            # Add a page break before endnotes section
            doc.add_page_break()
            doc.add_heading("Endnotes:", level=1)

        # Add the endnote text
        endnote_para = doc.add_paragraph("† " + endnote_text)
        endnote_para.style = "Endnote Text" if "Endnote Text" in doc.styles else "Normal"

        doc.save(filename)

        return f"Endnote added to paragraph {paragraph_index} in {filename}"
    except Exception as e:
        return f"Failed to add endnote: {str(e)}"


async def convert_footnotes_to_endnotes_in_document(filename: str) -> str:
    """Convert all footnotes to endnotes in a Word document.

    Args:
        filename: Path to the Word document
    """
    filename = ensure_docx_extension(filename)

    if not os.path.exists(filename):
        return f"Document {filename} does not exist"

    # Check if file is writeable
    is_writeable, error_message = check_file_writeable(filename)
    if not is_writeable:
        return f"Cannot modify document: {error_message}. Consider creating a copy first."

    try:
        doc = Document(filename)

        # Find all runs that might be footnote references
        footnote_references = []
        for para_idx, para in enumerate(doc.paragraphs):
            for run_idx, run in enumerate(para.runs):
                # Check if this run is likely a footnote reference
                # (superscript number or special character).
                # The non-empty test is needed because `"" in "¹²³"` is True.
                if run.font.superscript and run.text and (
                    run.text.isdigit() or run.text in "¹²³⁴⁵⁶⁷⁸⁹"
                ):
                    footnote_references.append({
                        "paragraph_index": para_idx,
                        "run_index": run_idx,
                        "text": run.text
                    })

        if not footnote_references:
            return f"No footnote references found in {filename}"

        # Collect the footnote text that follows the "Footnotes:" heading.
        # This happens before the endnotes section is created so that the new
        # page break and heading are not picked up as footnote text.
        found_footnote_section = False
        footnote_text = []
        for para in doc.paragraphs:
            if not found_footnote_section and para.text.startswith("Footnotes:"):
                found_footnote_section = True
                continue
            if found_footnote_section:
                footnote_text.append(para.text)

        # Create endnotes section
        doc.add_page_break()
        doc.add_heading("Endnotes:", level=1)

        # Create endnotes based on footnote references
        for i, ref in enumerate(footnote_references):
            # Add a new endnote
            endnote_para = doc.add_paragraph()

            # Try to match with footnote text, or use placeholder
            if i < len(footnote_text):
                endnote_para.text = f"†{i+1} {footnote_text[i]}"
            else:
                endnote_para.text = f"†{i+1} Converted from footnote {ref['text']}"

            # Change the footnote reference to an endnote reference
            try:
                paragraph = doc.paragraphs[ref["paragraph_index"]]
                paragraph.runs[ref["run_index"]].text = f"†{i+1}"
            except IndexError:
                # Skip if we can't locate the reference
                pass

        # Save the document
        doc.save(filename)

        return f"Converted {len(footnote_references)} footnotes to endnotes in {filename}"
    except Exception as e:
        return f"Failed to convert footnotes to endnotes: {str(e)}"
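

# ----------------------------------------------------------------------------
# Illustrative sketch (not part of the original module).
# The conversion above treats a superscript run as a footnote reference when
# its text looks like a number. A stricter, standalone version of that text
# test might look like the helper below. ASCII_DIGITS, SUPERSCRIPT_DIGITS and
# is_footnote_marker are hypothetical names introduced here for illustration.
# ----------------------------------------------------------------------------
ASCII_DIGITS = frozenset("0123456789")
SUPERSCRIPT_DIGITS = frozenset("⁰¹²³⁴⁵⁶⁷⁸⁹")


def is_footnote_marker(text: str) -> bool:
    """Return True if text is a non-empty run of ASCII or superscript digits."""
    # A substring test such as `text in "¹²³"` also accepts the empty string,
    # so emptiness is rejected explicitly before checking each character.
    if not text:
        return False
    return all(ch in ASCII_DIGITS or ch in SUPERSCRIPT_DIGITS for ch in text)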


async def add_footnote_after_text(filename: str, search_text: str, footnote_text: str,
                                  output_filename: Optional[str] = None) -> str:
    """Add a footnote after specific text in a Word document with proper formatting.

    This enhanced function ensures proper superscript formatting by managing styles at the XML level.

    Args:
        filename: Path to the Word document
        search_text: Text to search for (footnote will be added after this text)
        footnote_text: Content of the footnote
        output_filename: Optional output filename (if None, modifies in place)
    """
    filename = ensure_docx_extension(filename)

    if not os.path.exists(filename):
        return f"Document {filename} does not exist"

    # Check if file is writeable
    is_writeable, error_message = check_file_writeable(filename)
    if not is_writeable:
        return f"Cannot modify document: {error_message}. Consider creating a copy first."

    try:
        # Use robust implementation
        success, message, details = add_footnote_robust(
            filename=filename,
            search_text=search_text,
            footnote_text=footnote_text,
            output_filename=output_filename,
            position="after",
            validate_location=True
        )
        return message
    except Exception as e:
        return f"Failed to add footnote: {str(e)}"


async def add_footnote_before_text(filename: str, search_text: str, footnote_text: str,
                                   output_filename: Optional[str] = None) -> str:
    """Add a footnote before specific text in a Word document with proper formatting.

    This enhanced function ensures proper superscript formatting by managing styles at the XML level.

    Args:
        filename: Path to the Word document
        search_text: Text to search for (footnote will be added before this text)
        footnote_text: Content of the footnote
        output_filename: Optional output filename (if None, modifies in place)
    """
    filename = ensure_docx_extension(filename)

    if not os.path.exists(filename):
        return f"Document {filename} does not exist"

    # Check if file is writeable
    is_writeable, error_message = check_file_writeable(filename)
    if not is_writeable:
        return f"Cannot modify document: {error_message}. Consider creating a copy first."

    try:
        # Use robust implementation
        success, message, details = add_footnote_robust(
            filename=filename,
            search_text=search_text,
            footnote_text=footnote_text,
            output_filename=output_filename,
            position="before",
            validate_location=True
        )
        return message
    except Exception as e:
        return f"Failed to add footnote: {str(e)}"


async def add_footnote_enhanced(filename: str, paragraph_index: int, footnote_text: str,
                                output_filename: Optional[str] = None) -> str:
    """Enhanced version of add_footnote_to_document with proper superscript formatting.

    Now uses the robust implementation for better reliability.

    Args:
        filename: Path to the Word document
        paragraph_index: Index of the paragraph to add footnote to (0-based)
        footnote_text: Text content of the footnote
        output_filename: Optional output filename (if None, modifies in place)
    """
    filename = ensure_docx_extension(filename)

    # Ensure paragraph_index is an integer
    try:
        paragraph_index = int(paragraph_index)
    except (ValueError, TypeError):
        return "Invalid parameter: paragraph_index must be an integer"

    if not os.path.exists(filename):
        return f"Document {filename} does not exist"

    # Check if file is writeable
    is_writeable, error_message = check_file_writeable(filename)
    if not is_writeable:
        return f"Cannot modify document: {error_message}. Consider creating a copy first."

    try:
        # Use robust implementation
        success, message, details = add_footnote_robust(
            filename=filename,
            paragraph_index=paragraph_index,
            footnote_text=footnote_text,
            output_filename=output_filename,
            validate_location=True
        )
        return message
    except Exception as e:
        return f"Failed to add footnote: {str(e)}"


async def customize_footnote_style(filename: str, numbering_format: str = "1, 2, 3",
                                   start_number: int = 1, font_name: Optional[str] = None,
                                   font_size: Optional[int] = None) -> str:
    """Customize footnote numbering and formatting in a Word document.

    Args:
        filename: Path to the Word document
        numbering_format: Format for footnote numbers (e.g., "1, 2, 3", "i, ii, iii", "a, b, c")
        start_number: Number to start footnote numbering from
        font_name: Optional font name for footnotes
        font_size: Optional font size for footnotes (in points)
    """
    filename = ensure_docx_extension(filename)

    if not os.path.exists(filename):
        return f"Document {filename} does not exist"

    # Check if file is writeable
    is_writeable, error_message = check_file_writeable(filename)
    if not is_writeable:
        return f"Cannot modify document: {error_message}. Consider creating a copy first."

    try:
        doc = Document(filename)

        # Create or get footnote style
        footnote_style_name = "Footnote Text"
        footnote_style = None

        try:
            footnote_style = doc.styles[footnote_style_name]
        except KeyError:
            # Create the style if it doesn't exist
            footnote_style = doc.styles.add_style(footnote_style_name, WD_STYLE_TYPE.PARAGRAPH)

        # Apply formatting to footnote style
        if footnote_style:
            if font_name:
                footnote_style.font.name = font_name
            if font_size:
                footnote_style.font.size = Pt(font_size)

        # Find all existing footnote references
        footnote_refs = find_footnote_references(doc)

        # Generate format symbols for the specified numbering format
        format_symbols = get_format_symbols(numbering_format, len(footnote_refs) + start_number)

        # Apply custom formatting to footnotes
        count = customize_footnote_formatting(doc, footnote_refs, format_symbols, start_number, footnote_style)

        # Save the document
        doc.save(filename)

        return f"Footnote style and numbering customized in {filename}"
    except Exception as e:
        return f"Failed to customize footnote style: {str(e)}"


async def delete_footnote_from_document(filename: str, footnote_id: Optional[int] = None,
                                        search_text: Optional[str] = None,
                                        output_filename: Optional[str] = None) -> str:
    """Delete a footnote from a Word document.

    You can identify the footnote to delete either by:
    1. footnote_id: The numeric ID of the footnote (1, 2, 3, etc.)
    2. search_text: Text near the footnote reference to find and delete

    Args:
        filename: Path to the Word document
        footnote_id: Optional ID of the footnote to delete (1-based)
        search_text: Optional text to search near the footnote reference
        output_filename: Optional output filename (if None, modifies in place)
    """
    filename = ensure_docx_extension(filename)

    if not os.path.exists(filename):
        return f"Document {filename} does not exist"

    # Check if file is writeable
    is_writeable, error_message = check_file_writeable(filename)
    if not is_writeable:
        return f"Cannot modify document: {error_message}. Consider creating a copy first."

    try:
        # Use robust implementation with orphan cleanup
        success, message, details = delete_footnote_robust(
            filename=filename,
            footnote_id=footnote_id,
            search_text=search_text,
            output_filename=output_filename,
            clean_orphans=True
        )
        return message
    except Exception as e:
        return f"Failed to delete footnote: {str(e)}"


# ============================================================================
# Robust tool functions with Dict returns for structured responses
# ============================================================================


async def add_footnote_robust_tool(
    filename: str,
    search_text: Optional[str] = None,
    paragraph_index: Optional[int] = None,
    footnote_text: str = "",
    validate_location: bool = True,
    auto_repair: bool = False
) -> Dict[str, Any]:
    """
    Add a footnote with robust validation and error handling.

    This is the production-ready version with comprehensive Word compliance.

    Args:
        filename: Path to the Word document
        search_text: Text to search for (mutually exclusive with paragraph_index)
        paragraph_index: Index of paragraph (mutually exclusive with search_text)
        footnote_text: Content of the footnote
        validate_location: Whether to validate placement restrictions
        auto_repair: Whether to attempt automatic document repair

    Returns:
        Dict with success status, message, and optional details
    """
    filename = ensure_docx_extension(filename)

    # Check if file is writeable
    is_writeable, error_message = check_file_writeable(filename)
    if not is_writeable:
        return {
            "success": False,
            "message": f"Cannot modify document: {error_message}",
            "details": None
        }

    # Convert paragraph_index if provided as string
    if paragraph_index is not None:
        try:
            paragraph_index = int(paragraph_index)
        except (ValueError, TypeError):
            return {
                "success": False,
                "message": "Invalid parameter: paragraph_index must be an integer",
                "details": None
            }

    # Call robust implementation
    success, message, details = add_footnote_robust(
        filename=filename,
        search_text=search_text,
        paragraph_index=paragraph_index,
        footnote_text=footnote_text,
        validate_location=validate_location,
        auto_repair=auto_repair
    )

    return {
        "success": success,
        "message": message,
        "details": details
    }


async def delete_footnote_robust_tool(
    filename: str,
    footnote_id: Optional[int] = None,
    search_text: Optional[str] = None,
    clean_orphans: bool = True
) -> Dict[str, Any]:
    """
    Delete a footnote with comprehensive cleanup.

    Args:
        filename: Path to the Word document
        footnote_id: ID of footnote to delete
        search_text: Text near footnote reference
        clean_orphans: Whether to remove orphaned content

    Returns:
        Dict with success status, message, and optional details
    """
    filename = ensure_docx_extension(filename)

    # Check if file is writeable
    is_writeable, error_message = check_file_writeable(filename)
    if not is_writeable:
        return {
            "success": False,
            "message": f"Cannot modify document: {error_message}",
            "details": None
        }

    # Convert footnote_id if provided as string
    if footnote_id is not None:
        try:
            footnote_id = int(footnote_id)
        except (ValueError, TypeError):
            return {
                "success": False,
                "message": "Invalid parameter: footnote_id must be an integer",
                "details": None
            }

    # Call robust implementation
    success, message, details = delete_footnote_robust(
        filename=filename,
        footnote_id=footnote_id,
        search_text=search_text,
        clean_orphans=clean_orphans
    )

    return {
        "success": success,
        "message": message,
        "details": details
    }
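

# ----------------------------------------------------------------------------
# Illustrative sketch (not part of the original module).
# The two *_robust_tool functions above repeat the same two steps: coerce an
# index that may arrive as a string, and wrap (success, message, details) in a
# dict. Under that reading, the shared logic could be factored out roughly as
# below. coerce_optional_int and as_result are hypothetical helper names.
# ----------------------------------------------------------------------------
from typing import Any, Dict, Optional, Tuple


def coerce_optional_int(value: Any, name: str) -> Tuple[Optional[int], Optional[str]]:
    """Return (int_value, None) on success, or (None, error_message) on failure."""
    if value is None:
        # An omitted parameter is not an error; the caller decides what to do.
        return None, None
    try:
        return int(value), None
    except (ValueError, TypeError):
        return None, f"Invalid parameter: {name} must be an integer"


def as_result(success: bool, message: str,
              details: Optional[Dict[str, Any]] = None) -> Dict[str, Any]:
    """Build the structured response shape used by the robust tools."""
    return {"success": success, "message": message, "details": details}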


async def validate_footnotes_tool(filename: str) -> Dict[str, Any]:
    """
    Validate all footnotes in a document.

    Provides comprehensive validation report including:
    - ID conflicts
    - Orphaned content
    - Missing styles
    - Invalid locations
    - Coherence issues

    Args:
        filename: Path to the Word document

    Returns:
        Dict with validation status and detailed report
    """
    filename = ensure_docx_extension(filename)

    if not os.path.exists(filename):
        return {
            "valid": False,
            "message": f"Document {filename} does not exist",
            "report": {}
        }

    # Call validation
    is_valid, message, report = validate_document_footnotes(filename)

    return {
        "valid": is_valid,
        "message": message,
        "report": report
    }


# ============================================================================
# Compatibility wrappers for robust tools (maintain backward compatibility)
# ============================================================================


async def add_footnote_to_document_robust(
    filename: str,
    paragraph_index: int,
    footnote_text: str
) -> str:
    """
    Robust version of add_footnote_to_document.
    Maintains backward compatibility with existing API.
    """
    result = await add_footnote_robust_tool(
        filename=filename,
        paragraph_index=paragraph_index,
        footnote_text=footnote_text
    )
    return result["message"]


async def add_footnote_after_text_robust(
    filename: str,
    search_text: str,
    footnote_text: str,
    output_filename: Optional[str] = None
) -> str:
    """
    Robust version of add_footnote_after_text.
    Maintains backward compatibility with existing API.
    """
    # Handle output filename by copying first if needed
    working_file = filename
    if output_filename:
        import shutil
        shutil.copy2(filename, output_filename)
        working_file = output_filename

    result = await add_footnote_robust_tool(
        filename=working_file,
        search_text=search_text,
        footnote_text=footnote_text
    )
    return result["message"]


async def add_footnote_before_text_robust(
    filename: str,
    search_text: str,
    footnote_text: str,
    output_filename: Optional[str] = None
) -> str:
    """
    Robust version of add_footnote_before_text.
    Note: add_footnote_robust_tool does not take a position argument, so this
    wrapper currently falls back to the robust implementation's default
    'after' position rather than inserting before the matched text.
    """
    # Handle output filename by copying first if needed
    working_file = filename
    if output_filename:
        import shutil
        shutil.copy2(filename, output_filename)
        working_file = output_filename

    result = await add_footnote_robust_tool(
        filename=working_file,
        search_text=search_text,
        footnote_text=footnote_text
    )
    return result["message"]


async def delete_footnote_from_document_robust(
    filename: str,
    footnote_id: Optional[int] = None,
    search_text: Optional[str] = None,
    output_filename: Optional[str] = None
) -> str:
    """
    Robust version of delete_footnote_from_document.
    Maintains backward compatibility with existing API.
    """
    # Handle output filename by copying first if needed
    working_file = filename
    if output_filename:
        import shutil
        shutil.copy2(filename, output_filename)
        working_file = output_filename

    result = await delete_footnote_robust_tool(
        filename=working_file,
        footnote_id=footnote_id,
        search_text=search_text
    )
    return result["message"]
```
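
The compatibility wrappers above share one pattern for `output_filename`: copy the source document to the output path with `shutil.copy2`, then run the edit against the copy so the original stays untouched. The sketch below isolates that pattern with plain text files so it runs without python-docx. The names `apply_to_copy` and `append_marker` are hypothetical and exist only for this illustration.

```python
import shutil
import tempfile
from pathlib import Path
from typing import Callable, Optional


def apply_to_copy(filename: str, output_filename: Optional[str],
                  modify: Callable[[str], str]) -> str:
    """Run `modify` on a copy when output_filename is given, else in place."""
    working_file = filename
    if output_filename:
        # copy2 copies the file contents and tries to preserve metadata too
        shutil.copy2(filename, output_filename)
        working_file = output_filename
    return modify(working_file)


def append_marker(path: str) -> str:
    """Stand-in for a real document edit: append one line to a text file."""
    with open(path, "a", encoding="utf-8") as fh:
        fh.write("edited\n")
    return f"Modified {path}"


# Small demonstration in a throwaway directory
with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "source.txt"
    dst = Path(tmp) / "copy.txt"
    src.write_text("original\n", encoding="utf-8")

    message = apply_to_copy(str(src), str(dst), append_marker)

    source_after = src.read_text(encoding="utf-8")  # unchanged original
    copy_after = dst.read_text(encoding="utf-8")    # edited copy
```

One consequence of copying first is that the output file exists even when the edit itself fails, so a caller that cares about this may want to remove the copy on failure.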