This is page 1 of 2. Use http://codebase.md/gongrzhe/office-word-mcp-server?lines=true&page={x} to view the full context.

# Directory Structure

```
├── __init__.py
├── .gitignore
├── Dockerfile
├── LICENSE
├── mcp-config.json
├── office_word_mcp_server
│   └── __init__.py
├── pyproject.toml
├── README.md
├── RENDER_DEPLOYMENT.md
├── requirements.txt
├── setup_mcp.py
├── smithery.yaml
├── test_formatting.py
├── tests
│   └── test_convert_to_pdf.py
├── uv.lock
├── word_document_server
│   ├── __init__.py
│   ├── core
│   │   ├── __init__.py
│   │   ├── comments.py
│   │   ├── footnotes.py
│   │   ├── protection.py
│   │   ├── styles.py
│   │   ├── tables.py
│   │   └── unprotect.py
│   ├── main.py
│   ├── tools
│   │   ├── __init__.py
│   │   ├── comment_tools.py
│   │   ├── content_tools.py
│   │   ├── document_tools.py
│   │   ├── extended_document_tools.py
│   │   ├── footnote_tools.py
│   │   ├── format_tools.py
│   │   └── protection_tools.py
│   └── utils
│       ├── __init__.py
│       ├── document_utils.py
│       ├── extended_document_utils.py
│       └── file_utils.py
└── word_mcp_server.py
```

# Files

--------------------------------------------------------------------------------
/.gitignore:
--------------------------------------------------------------------------------

```
 1 | # Project files
 2 | .idea
 3 | .DS_Store
 4 | 
 5 | # Python-generated files
 6 | __pycache__/
 7 | *.py[oc]
 8 | build/
 9 | dist/
10 | wheels/
11 | *.egg-info
12 | 
13 | # Virtual environments
14 | .venv
15 | .env.example
16 | .idea
17 | 
```

--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------

```markdown
  1 | # Office-Word-MCP-Server
  2 | 
  3 | [![smithery badge](https://smithery.ai/badge/@GongRzhe/Office-Word-MCP-Server)](https://smithery.ai/server/@GongRzhe/Office-Word-MCP-Server)
  4 | 
  5 | A Model Context Protocol (MCP) server for creating, reading, and manipulating Microsoft Word documents. This server enables AI assistants to work with Word documents through a standardized interface, providing rich document editing capabilities.
  6 | 
  7 | <a href="https://glama.ai/mcp/servers/@GongRzhe/Office-Word-MCP-Server">
  8 |   <img width="380" height="200" src="https://glama.ai/mcp/servers/@GongRzhe/Office-Word-MCP-Server/badge" alt="Office Word Server MCP server" />
  9 | </a>
 10 | 
 11 | ![](https://badge.mcpx.dev?type=server "MCP Server")
 12 | 
 13 | ## Overview
 14 | 
 15 | Office-Word-MCP-Server implements the [Model Context Protocol](https://modelcontextprotocol.io/) to expose Word document operations as tools and resources. It serves as a bridge between AI assistants and Microsoft Word documents, allowing for document creation, content addition, formatting, and analysis.
 16 | 
 17 | The server features a modular architecture that separates concerns into core functionality, tools, and utilities, making it highly maintainable and extensible for future enhancements.
 18 | 
 19 | ### Example
 20 | 
 21 | #### Prompt
 22 | 
 23 | ![image](https://github.com/user-attachments/assets/f49b0bcc-88b2-4509-bf50-995b9a40038c)
 24 | 
 25 | #### Output
 26 | 
 27 | ![image](https://github.com/user-attachments/assets/ff64385d-3822-4160-8cdf-f8a484ccc01a)
 28 | 
 29 | ## Features
 30 | 
 31 | ### Document Management
 32 | 
 33 | - Create new Word documents with metadata
 34 | - Extract text and analyze document structure
 35 | - View document properties and statistics
 36 | - List available documents in a directory
 37 | - Create copies of existing documents
 38 | - Merge multiple documents into a single document
 39 | - Convert Word documents to PDF format
 40 | 
 41 | ### Content Creation
 42 | 
 43 | - Add headings with different levels and direct formatting (font, size, bold, italic, borders)
 44 | - Insert paragraphs with optional styling and direct formatting (font, size, bold, italic, color)
 45 | - Create tables with custom data
 46 | - Add images with proportional scaling
 47 | - Insert page breaks
 48 | - Insert bulleted and numbered lists with proper XML formatting
 49 | - Add footnotes and endnotes to documents
 50 | - Convert footnotes to endnotes
 51 | - Customize footnote and endnote styling
 52 | - Create professional table layouts for technical documentation
 53 | - Design callout boxes and formatted content for instructional materials
 54 | - Build structured data tables for business reports with consistent styling
 55 | - Insert content relative to existing text or paragraph indices
 56 | 
 57 | ### Rich Text Formatting
 58 | 
 59 | - Format specific text sections (bold, italic, underline)
 60 | - Change text color and font properties
 61 | - Apply custom styles to text elements
 62 | - Search and replace text throughout documents
 63 | - Individual cell text formatting within tables
 64 | - Multiple formatting combinations for enhanced visual appeal
 65 | - Font customization with family and size control
 66 | - Direct formatting during content creation (paragraphs and headings)
 67 | - Reduce function calls by combining content creation with formatting
 68 | - Add section header borders for visual separation
 69 | 
 70 | ### Table Formatting
 71 | 
 72 | - Format tables with borders and styles
 73 | - Create header rows with distinct formatting
 74 | - Apply cell shading and custom borders
 75 | - Structure tables for better readability
 76 | - Individual cell background shading with color support
 77 | - Alternating row colors for improved readability
 78 | - Enhanced header row highlighting with custom colors
 79 | - Cell text formatting with bold, italic, underline, color, font size, and font family
 80 | - Comprehensive color support with named colors and hex color codes
 81 | - Cell padding management with independent control of all sides
 82 | - Cell alignment (horizontal and vertical positioning)
 83 | - Cell merging (horizontal, vertical, and rectangular areas)
 84 | - Column width management with multiple units (points, percentage, auto-fit)
 85 | - Auto-fit capabilities for dynamic column sizing
 86 | - Professional callout table support with icon cells and styled content
 87 | 
 88 | ### Advanced Document Manipulation
 89 | 
 90 | - Delete paragraphs
 91 | - Insert content relative to specific text or paragraph indices
 92 | - Insert bulleted and numbered lists with proper XML numbering structure
 93 | - Insert headers and paragraphs before or after target locations
 94 | - Create custom document styles
 95 | - Apply consistent formatting throughout documents
 96 | - Format specific ranges of text with detailed control
 97 | - Flexible padding units with support for points and percentage-based measurements
 98 | - Clear, readable table presentation with proper alignment and spacing
 99 | 
100 | ### Document Protection
101 | 
102 | - Add password protection to documents
103 | - Implement restricted editing with editable sections
104 | - Add digital signatures to documents
105 | - Verify document authenticity and integrity
106 | 
107 | ### Comment Extraction
108 | 
109 | - Extract all comments from a document
110 | - Filter comments by author
111 | - Get comments for specific paragraphs
112 | - Access comment metadata (author, date, text)
113 | 
114 | ## Installation
115 | 
116 | ### Installing via Smithery
117 | 
118 | To install Office Word Document Server for Claude Desktop automatically via [Smithery](https://smithery.ai/server/@GongRzhe/Office-Word-MCP-Server):
119 | 
120 | ```bash
121 | npx -y @smithery/cli install @GongRzhe/Office-Word-MCP-Server --client claude
122 | ```
123 | 
124 | ### Prerequisites
125 | 
126 | - Python 3.11 or higher
127 | - pip package manager
128 | 
129 | ### Basic Installation
130 | 
131 | ```bash
132 | # Clone the repository
133 | git clone https://github.com/GongRzhe/Office-Word-MCP-Server.git
134 | cd Office-Word-MCP-Server
135 | 
136 | # Install dependencies
137 | pip install -r requirements.txt
138 | ```
139 | 
140 | ### Using the Setup Script
141 | 
142 | Alternatively, run the provided setup script, which handles:
143 | 
144 | - Checking prerequisites
145 | - Setting up a virtual environment
146 | - Installing dependencies
147 | - Generating MCP configuration
148 | 
149 | ```bash
150 | python setup_mcp.py
151 | ```
152 | 
153 | ## Usage with Claude for Desktop
154 | 
155 | ### Configuration
156 | 
157 | #### Method 1: After Local Installation
158 | 
159 | 1. After installation, add the server to your Claude for Desktop configuration file:
160 | 
161 | ```json
162 | {
163 |   "mcpServers": {
164 |     "word-document-server": {
165 |       "command": "python",
166 |       "args": ["/path/to/word_mcp_server.py"]
167 |     }
168 |   }
169 | }
170 | ```
171 | 
172 | #### Method 2: Without Installation (Using uvx)
173 | 
174 | 1. You can also configure Claude for Desktop to use the server without local installation by using the uvx package manager:
175 | 
176 | ```json
177 | {
178 |   "mcpServers": {
179 |     "word-document-server": {
180 |       "command": "uvx",
181 |       "args": ["--from", "office-word-mcp-server", "word_mcp_server"]
182 |     }
183 |   }
184 | }
185 | ```
186 | 
187 | 2. Configuration file locations:
188 | 
189 |    - macOS: `~/Library/Application Support/Claude/claude_desktop_config.json`
190 |    - Windows: `%APPDATA%\Claude\claude_desktop_config.json`
191 | 
192 | 3. Restart Claude for Desktop to load the configuration.
193 | 
194 | ### Example Operations
195 | 
196 | Once configured, you can ask Claude to perform operations like:
197 | 
198 | - "Create a new document called 'report.docx' with a title page"
199 | - "Add a heading and three paragraphs to my document"
200 | - "Add my name in Helvetica 36pt bold at the top of the document"
201 | - "Add a section heading 'Summary' in Helvetica 14pt bold with a bottom border"
202 | - "Add a paragraph in Times New Roman 14pt with italic blue text"
203 | - "Insert a bulleted list after the paragraph containing 'Introduction'"
204 | - "Insert a numbered list with items: 'First step', 'Second step', 'Third step'"
205 | - "Add bullet points after the 'Summary' heading"
206 | - "Insert a 4x4 table with sales data"
207 | - "Format the word 'important' in paragraph 2 to be bold and red"
208 | - "Search and replace all instances of 'old term' with 'new term'"
209 | - "Create a custom style for section headings"
210 | - "Apply formatting to the table in my document"
211 | - "Extract all comments from my document"
212 | - "Show me all comments by John Doe"
213 | - "Get comments for paragraph 3"
214 | - "Make the text in table cell (1,2) bold and blue with 14pt font"
215 | - "Add 10 points of padding to all sides of the header cells"
216 | - "Create a callout table with a blue checkmark icon and white text"
217 | - "Set the first column width to 50 points and auto-fit the remaining columns"
218 | - "Apply alternating row colors to make the table more readable"
219 | 
220 | 
221 | ## API Reference
222 | 
223 | ### Document Creation and Properties
224 | 
225 | ```python
226 | create_document(filename, title=None, author=None)
227 | get_document_info(filename)
228 | get_document_text(filename)
229 | get_document_outline(filename)
230 | list_available_documents(directory=".")
231 | copy_document(source_filename, destination_filename=None)
232 | convert_to_pdf(filename, output_filename=None)
233 | ```
234 | 
235 | ### Content Addition
236 | 
237 | ```python
238 | add_heading(filename, text, level=1, font_name=None, font_size=None,
239 |             bold=None, italic=None, border_bottom=False)
240 | add_paragraph(filename, text, style=None, font_name=None, font_size=None,
241 |               bold=None, italic=None, color=None)
242 | add_table(filename, rows, cols, data=None)
243 | add_picture(filename, image_path, width=None)
244 | add_page_break(filename)
245 | ```
246 | 
247 | ### Advanced Content Manipulation
248 | 
249 | ```python
250 | # Insert content relative to existing text or paragraph index
251 | insert_header_near_text(filename, target_text=None, header_title=None,
252 |                        position='after', header_style='Heading 1',
253 |                        target_paragraph_index=None)
254 | 
255 | insert_line_or_paragraph_near_text(filename, target_text=None, line_text=None,
256 |                                    position='after', line_style=None,
257 |                                    target_paragraph_index=None)
258 | 
259 | # Insert bulleted or numbered lists with proper XML formatting
260 | insert_numbered_list_near_text(filename, target_text=None, list_items=None,
261 |                               position='after', target_paragraph_index=None,
262 |                               bullet_type='bullet')
263 | # bullet_type options:
264 | #   'bullet' - Creates bulleted list with bullets (•)
265 | #   'number' - Creates numbered list (1, 2, 3, ...)
266 | ```
267 | 
268 | ### Content Extraction
269 | 
270 | ```python
271 | get_document_text(filename)
272 | get_paragraph_text_from_document(filename, paragraph_index)
273 | find_text_in_document(filename, text_to_find, match_case=True, whole_word=False)
274 | ```
275 | 
276 | ### Text Formatting
277 | 
278 | ```python
279 | format_text(filename, paragraph_index, start_pos, end_pos, bold=None,
280 |             italic=None, underline=None, color=None, font_size=None, font_name=None)
281 | search_and_replace(filename, find_text, replace_text)
282 | delete_paragraph(filename, paragraph_index)
283 | create_custom_style(filename, style_name, bold=None, italic=None,
284 |                     font_size=None, font_name=None, color=None, base_style=None)
285 | ```
286 | 
287 | ### Table Formatting
288 | 
289 | ```python
290 | format_table(filename, table_index, has_header_row=None,
291 |              border_style=None, shading=None)
292 | set_table_cell_shading(filename, table_index, row_index, col_index, 
293 |                       fill_color, pattern="clear")
294 | apply_table_alternating_rows(filename, table_index, 
295 |                             color1="FFFFFF", color2="F2F2F2")
296 | highlight_table_header(filename, table_index, 
297 |                       header_color="4472C4", text_color="FFFFFF")
298 | 
299 | # Cell merging tools
300 | merge_table_cells(filename, table_index, start_row, start_col, end_row, end_col)
301 | merge_table_cells_horizontal(filename, table_index, row_index, start_col, end_col)
302 | merge_table_cells_vertical(filename, table_index, col_index, start_row, end_row)
303 | 
304 | # Cell alignment tools
305 | set_table_cell_alignment(filename, table_index, row_index, col_index,
306 |                         horizontal="left", vertical="top")
307 | set_table_alignment_all(filename, table_index, 
308 |                        horizontal="left", vertical="top")
309 | 
310 | # Cell text formatting tools
311 | format_table_cell_text(filename, table_index, row_index, col_index,
312 |                       text_content=None, bold=None, italic=None, underline=None,
313 |                       color=None, font_size=None, font_name=None)
314 | 
315 | # Cell padding tools
316 | set_table_cell_padding(filename, table_index, row_index, col_index,
317 |                       top=None, bottom=None, left=None, right=None, unit="points")
318 | 
319 | # Column width management
320 | set_table_column_width(filename, table_index, col_index, width, width_type="points")
321 | set_table_column_widths(filename, table_index, widths, width_type="points")
322 | set_table_width(filename, table_index, width, width_type="points")
323 | auto_fit_table_columns(filename, table_index)
324 | ```
325 | 
326 | ### Comment Extraction
327 | 
328 | ```python
329 | get_all_comments(filename)
330 | get_comments_by_author(filename, author)
331 | get_comments_for_paragraph(filename, paragraph_index)
332 | ```
333 | 
334 | ## Troubleshooting
335 | 
336 | ### Common Issues
337 | 
338 | 1. **Missing Styles**
339 | 
340 |    - Some documents may lack required styles for heading and table operations
341 |    - The server will attempt to create missing styles or use direct formatting
342 |    - For best results, use templates with standard Word styles
343 | 
344 | 2. **Permission Issues**
345 | 
346 |    - Ensure the server has permission to read/write to the document paths
347 |    - Use the `copy_document` function to create editable copies of locked documents
348 |    - Check file ownership and permissions if operations fail
349 | 
350 | 3. **Image Insertion Problems**
351 |    - Use absolute paths for image files
352 |    - Verify image format compatibility (JPEG, PNG recommended)
353 |    - Check image file size and permissions
354 | 
355 | 4. **Table Formatting Issues**
356 | 
357 |    - **Cell index errors**: Ensure row and column indices are within table bounds (0-based indexing)
358 |    - **Color format problems**: Use hex colors without '#' prefix (e.g., "FF0000" for red) or standard color names
359 |    - **Padding unit confusion**: Specify "points" or "percent" explicitly when setting cell padding
360 |    - **Column width conflicts**: Auto-fit may override manual column width settings
361 |    - **Text formatting persistence**: Apply cell text formatting after setting cell content for best results
362 | 
363 | ### Debugging
364 | 
365 | Enable detailed logging by setting the environment variable:
366 | 
367 | ```bash
368 | export MCP_DEBUG=1  # Linux/macOS
369 | set MCP_DEBUG=1     # Windows
370 | ```
371 | 
372 | ## Contributing
373 | 
374 | Contributions are welcome! Please feel free to submit a Pull Request.
375 | 
376 | 1. Fork the repository
377 | 2. Create your feature branch (`git checkout -b feature/amazing-feature`)
378 | 3. Commit your changes (`git commit -m 'Add some amazing feature'`)
379 | 4. Push to the branch (`git push origin feature/amazing-feature`)
380 | 5. Open a Pull Request
381 | 
382 | ## License
383 | 
384 | This project is licensed under the MIT License - see the LICENSE file for details.
385 | 
386 | ## Acknowledgments
387 | 
388 | - [Model Context Protocol](https://modelcontextprotocol.io/) for the protocol specification
389 | - [python-docx](https://python-docx.readthedocs.io/) for Word document manipulation
390 | - [FastMCP](https://github.com/modelcontextprotocol/python-sdk) for the Python MCP implementation
391 | 
392 | ---
393 | 
394 | _Note: This server interacts with document files on your system. Always verify that requested operations are appropriate before confirming them in Claude for Desktop or other MCP clients._
395 | 
```
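The table troubleshooting notes in the README (0-based cell indices, hex colors without a `#` prefix) can be checked before a tool call is made. The following is a minimal sketch and not part of the server: `normalize_hex_color` and `check_cell_index` are hypothetical helper names, and named colors such as "red" are deliberately not handled here.

```python
import re

# Six hexadecimal digits, e.g. "FF0000"; the README asks for no leading '#'.
_HEX_COLOR = re.compile(r"[0-9A-Fa-f]{6}")


def normalize_hex_color(value: str) -> str:
    """Return a six-digit uppercase hex color without a leading '#'.

    Hypothetical helper: raises ValueError for anything that is not a
    six-digit hex color (named colors are out of scope for this sketch).
    """
    candidate = value.strip()
    if candidate.startswith("#"):
        candidate = candidate[1:]
    if not _HEX_COLOR.fullmatch(candidate):
        raise ValueError(f"not a six-digit hex color: {value!r}")
    return candidate.upper()


def check_cell_index(row_index: int, col_index: int,
                     row_count: int, col_count: int) -> None:
    """Raise IndexError unless (row_index, col_index) is a valid 0-based cell."""
    if not 0 <= row_index < row_count:
        raise IndexError(f"row_index {row_index} outside 0..{row_count - 1}")
    if not 0 <= col_index < col_count:
        raise IndexError(f"col_index {col_index} outside 0..{col_count - 1}")
```

A caller could run both checks and then pass the normalized color to a tool such as `set_table_cell_shading`.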

--------------------------------------------------------------------------------
/requirements.txt:
--------------------------------------------------------------------------------

```
1 | fastmcp
2 | python-docx
3 | msoffcrypto-tool
4 | docx2pdf
5 | python-dotenv
```

--------------------------------------------------------------------------------
/office_word_mcp_server/__init__.py:
--------------------------------------------------------------------------------

```python
1 | from word_document_server.main import run_server
2 | 
3 | __all__ = ["run_server"]
4 | 
```

--------------------------------------------------------------------------------
/__init__.py:
--------------------------------------------------------------------------------

```python
1 | """Office Word MCP Server package entry point."""
2 | from word_document_server.main import run_server
3 | 
4 | __all__ = ["run_server"]
5 | 
```

--------------------------------------------------------------------------------
/word_mcp_server.py:
--------------------------------------------------------------------------------

```python
 1 | #!/usr/bin/env python3
 2 | """
 3 | Run script for the Word Document Server.
 4 | 
 5 | This script provides a simple way to start the Word Document Server.
 6 | """
 7 | 
 8 | from word_document_server.main import run_server
 9 | 
10 | if __name__ == "__main__":
11 |     run_server()
12 | 
```

--------------------------------------------------------------------------------
/mcp-config.json:
--------------------------------------------------------------------------------

```json
 1 | {
 2 |   "mcpServers": {
 3 |     "word-document-server": {
 4 |       "command": "/Users/gongzhe/GitRepos/Office-Word-MCP-Server/.venv/bin/python",
 5 |       "args": [
 6 |         "/Users/gongzhe/GitRepos/Office-Word-MCP-Server/word_mcp_server.py"
 7 |       ],
 8 |       "env": {
 9 |         "PYTHONPATH": "/Users/gongzhe/GitRepos/Office-Word-MCP-Server",
10 |         "MCP_TRANSPORT": "stdio"
11 |       }
12 |     }
13 |   }
14 | }
```
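The committed `mcp-config.json` hard-codes one developer's absolute paths. As a rough sketch, the same structure could be generated from a repository root instead. `build_mcp_config` is a hypothetical name, and the `.venv/bin/python` layout is an assumption that mirrors the POSIX paths in the file above (a Windows virtual environment uses a different layout).

```python
import json
from pathlib import PurePosixPath


def build_mcp_config(repo_root: str, transport: str = "stdio") -> dict:
    """Build the mcpServers entry shown in mcp-config.json for a given checkout.

    Assumes a POSIX-style virtual environment at <repo_root>/.venv.
    """
    root = PurePosixPath(repo_root)
    return {
        "mcpServers": {
            "word-document-server": {
                "command": str(root / ".venv" / "bin" / "python"),
                "args": [str(root / "word_mcp_server.py")],
                "env": {
                    "PYTHONPATH": str(root),
                    "MCP_TRANSPORT": transport,
                },
            }
        }
    }


if __name__ == "__main__":
    # Placeholder path; substitute the location of your own checkout.
    print(json.dumps(build_mcp_config("/path/to/Office-Word-MCP-Server"), indent=2))
```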

--------------------------------------------------------------------------------
/word_document_server/utils/__init__.py:
--------------------------------------------------------------------------------

```python
1 | """
2 | Utility functions for the Word Document Server.
3 | 
4 | This package contains utility modules for file operations and document handling.
5 | """
6 | 
7 | from word_document_server.utils.file_utils import check_file_writeable, create_document_copy, ensure_docx_extension
8 | from word_document_server.utils.document_utils import get_document_properties, extract_document_text, get_document_structure, find_paragraph_by_text, find_and_replace_text
9 | 
```

--------------------------------------------------------------------------------
/smithery.yaml:
--------------------------------------------------------------------------------

```yaml
 1 | # Smithery configuration file: https://smithery.ai/docs/build/project-config
 2 | 
 3 | startCommand:
 4 |   type: stdio
 5 |   configSchema:
 6 |     # JSON Schema defining the configuration options for the MCP.
 7 |     type: object
 8 |     description: No configuration options required
 9 |   commandFunction:
10 |     # A JS function that produces the CLI command based on the given config to start the MCP on stdio.
11 |     |-
12 |     (config) => ({command:'word_mcp_server', args:[]})
13 |   exampleConfig: {}
14 | 
```

--------------------------------------------------------------------------------
/word_document_server/__init__.py:
--------------------------------------------------------------------------------

```python
 1 | """
 2 | Word Document Server - MCP server for Microsoft Word document manipulation.
 3 | 
 4 | This package provides tools for creating, reading, and manipulating Microsoft Word 
 5 | documents through the Model Context Protocol (MCP).
 6 | 
 7 | Features:
 8 | - Document creation and management
 9 | - Content addition (headings, paragraphs, tables, images)
10 | - Text and table formatting
11 | - Document protection (password, restricted editing, signatures)
12 | - Footnote and endnote management
13 | """
14 | 
15 | __version__ = "1.0.0"
16 | 
```

--------------------------------------------------------------------------------
/Dockerfile:
--------------------------------------------------------------------------------

```dockerfile
 1 | # Generated by https://smithery.ai. See: https://smithery.ai/docs/build/project-config
 2 | # syntax=docker/dockerfile:1
 3 | 
 4 | # Use official Python runtime
 5 | FROM python:3.11-slim
 6 | 
 7 | # Set working directory
 8 | WORKDIR /app
 9 | 
10 | # Install build dependencies
11 | RUN apt-get update \
12 |     && apt-get install -y --no-install-recommends build-essential \
13 |     && rm -rf /var/lib/apt/lists/*
14 | 
15 | # Copy project files
16 | COPY . /app
17 | 
18 | # Install Python dependencies
19 | RUN pip install --no-cache-dir .
20 | 
21 | # Default command
22 | ENTRYPOINT ["word_mcp_server"]
23 | 
```

--------------------------------------------------------------------------------
/word_document_server/core/__init__.py:
--------------------------------------------------------------------------------

```python
 1 | """
 2 | Core functionality for the Word Document Server.
 3 | 
 4 | This package contains the core functionality modules used by the Word Document Server.
 5 | """
 6 | 
 7 | from word_document_server.core.styles import ensure_heading_style, ensure_table_style, create_style
 8 | from word_document_server.core.protection import add_protection_info, verify_document_protection, is_section_editable, create_signature_info, verify_signature
 9 | from word_document_server.core.footnotes import add_footnote, add_endnote, convert_footnotes_to_endnotes, find_footnote_references, get_format_symbols, customize_footnote_formatting
10 | from word_document_server.core.tables import set_cell_border, apply_table_style, copy_table
11 | 
```

--------------------------------------------------------------------------------
/pyproject.toml:
--------------------------------------------------------------------------------

```toml
 1 | [build-system]
 2 | requires = ["hatchling"]
 3 | build-backend = "hatchling.build"
 4 | 
 5 | [project]
 6 | name = "office-word-mcp-server"
 7 | version = "1.1.10"
 8 | description = "MCP server for manipulating Microsoft Word documents"
 9 | readme = "README.md"
10 | license = {file = "LICENSE"}
11 | authors = [
12 |     {name = "GongRzhe", email = "[email protected]"}
13 | ]
14 | classifiers = [
15 |     "Programming Language :: Python :: 3",
16 |     "License :: OSI Approved :: MIT License",
17 |     "Operating System :: OS Independent",
18 | ]
19 | requires-python = ">=3.11"
20 | dependencies = [
21 |     "python-docx>=1.1.2",
22 |     "fastmcp>=2.8.1",
23 |     "msoffcrypto-tool>=5.4.2",
24 |     "docx2pdf>=0.1.8",
25 |     "pytest>=8.4.2",
26 | ]
27 | 
28 | [project.urls]
29 | "Homepage" = "https://github.com/GongRzhe/Office-Word-MCP-Server"
30 | "Bug Tracker" = "https://github.com/GongRzhe/Office-Word-MCP-Server/issues"
31 | 
32 | [tool.hatch.build.targets.wheel]
33 | only-include = [
34 |     "word_document_server",
35 |     "office_word_mcp_server",
36 | ]
37 | sources = ["."]
38 | 
39 | [project.scripts]
40 | word_mcp_server = "word_document_server.main:run_server"
41 | 
```
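The `[project.scripts]` table maps the `word_mcp_server` command to the string `word_document_server.main:run_server`, in the usual `module:attribute` form. The sketch below illustrates how such a string can be resolved to a callable. `resolve_entry_point` is a hypothetical helper written for illustration; installers generate their own launcher code, and this is not it.

```python
import importlib
from typing import Any, Callable


def resolve_entry_point(spec: str) -> Callable[..., Any]:
    """Resolve a 'package.module:attr' string to the object it names."""
    module_name, _, attr_path = spec.partition(":")
    if not module_name or not attr_path:
        raise ValueError(f"expected 'module:attribute', got {spec!r}")
    target: Any = importlib.import_module(module_name)
    # Walk dotted attribute paths such as 'Class.method'.
    for name in attr_path.split("."):
        target = getattr(target, name)
    return target


if __name__ == "__main__":
    # Demonstrated with a standard-library target so the sketch runs anywhere.
    dumps = resolve_entry_point("json:dumps")
    print(dumps({"ok": True}))
```

With the package installed, `resolve_entry_point("word_document_server.main:run_server")` would be expected to return the function that the console script calls.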

--------------------------------------------------------------------------------
/word_document_server/tools/__init__.py:
--------------------------------------------------------------------------------

```python
 1 | """
 2 | MCP tool implementations for the Word Document Server.
 3 | 
 4 | This package contains the MCP tool implementations that expose functionality
 5 | to clients through the Model Context Protocol.
 6 | """
 7 | 
 8 | # Document tools
 9 | from word_document_server.tools.document_tools import (
10 |     create_document, get_document_info, get_document_text, 
11 |     get_document_outline, list_available_documents, 
12 |     copy_document, merge_documents
13 | )
14 | 
15 | # Content tools
16 | from word_document_server.tools.content_tools import (
17 |     add_heading, add_paragraph, add_table, add_picture,
18 |     add_page_break, add_table_of_contents, delete_paragraph,
19 |     search_and_replace
20 | )
21 | 
22 | # Format tools
23 | from word_document_server.tools.format_tools import (
24 |     format_text, create_custom_style, format_table
25 | )
26 | 
27 | # Protection tools
28 | from word_document_server.tools.protection_tools import (
29 |     protect_document, add_restricted_editing,
30 |     add_digital_signature, verify_document
31 | )
32 | 
33 | # Footnote tools
34 | from word_document_server.tools.footnote_tools import (
35 |     add_footnote_to_document, add_endnote_to_document,
36 |     convert_footnotes_to_endnotes_in_document, customize_footnote_style
37 | )
38 | 
39 | # Comment tools
40 | from word_document_server.tools.comment_tools import (
41 |     get_all_comments, get_comments_by_author, get_comments_for_paragraph
42 | )
43 | 
```

--------------------------------------------------------------------------------
/RENDER_DEPLOYMENT.md:
--------------------------------------------------------------------------------

```markdown
 1 | # Render Deployment Guide
 2 | 
 3 | This document explains how to deploy the Office Word MCP Server on Render.
 4 | 
 5 | ## Required Environment Variables
 6 | 
 7 | Set the following environment variables in your Render service:
 8 | 
 9 | ### `MCP_TRANSPORT`
10 | - **Value**: `sse`
11 | - **Description**: Sets the transport type to Server-Sent Events (SSE) for HTTP communication
12 | - **Required**: Yes (for Render deployment)
13 | 
14 | ### `MCP_HOST`
15 | - **Value**: `0.0.0.0`
16 | - **Description**: Binds the server to all network interfaces
17 | - **Required**: No (defaults to 0.0.0.0)
18 | 
19 | ### `FASTMCP_LOG_LEVEL`
20 | - **Value**: `INFO`
21 | - **Description**: Sets the logging level for FastMCP
22 | - **Required**: No (defaults to INFO)
23 | 
24 | ## How to Set Environment Variables
25 | 
26 | 1. Go to your Render dashboard: https://dashboard.render.com
27 | 2. Navigate to your service: `Office-Word-MCP-Server`
28 | 3. Click on "Environment" in the left sidebar
29 | 4. Add the environment variable:
30 |    - Key: `MCP_TRANSPORT`
31 |    - Value: `sse`
32 | 5. Click "Save Changes"
33 | 
34 | ## Deployment
35 | 
36 | After setting the environment variables:
37 | 1. Render will automatically redeploy your service
38 | 2. The server will start with SSE transport on the port provided by Render
39 | 3. Access your server at: `https://office-word-mcp-server-bzlp.onrender.com/sse`
40 | 
41 | ## Health Check Endpoint
42 | 
43 | The FastMCP server with SSE transport automatically provides a health check endpoint at:
44 | - `https://your-service.onrender.com/health`
45 | 
46 | ## Troubleshooting
47 | 
48 | ### Server exits with status 1
49 | - **Cause**: Server is running in STDIO mode instead of SSE
50 | - **Fix**: Ensure `MCP_TRANSPORT=sse` is set in environment variables
51 | 
52 | ### Port binding errors
53 | - **Cause**: Server not using Render's PORT environment variable
54 | - **Fix**: This has been fixed in the latest version of main.py
55 | 
56 | ### Cannot connect to server
57 | - **Cause**: Health checks failing
58 | - **Fix**: Ensure SSE transport is enabled and server is listening on 0.0.0.0
59 | 
60 | 
```
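The deployment guide describes configuration through `MCP_TRANSPORT`, `MCP_HOST`, and Render's `PORT` variable. A minimal sketch of reading those settings follows. It is not the project's `main.py`: `load_transport_settings` is a hypothetical name, the `stdio` fallback is inferred from the troubleshooting notes, and the fallback port of 8000 is purely an assumption.

```python
import os
from typing import Mapping, Optional


def load_transport_settings(env: Optional[Mapping[str, str]] = None) -> dict:
    """Collect transport settings from environment variables.

    Defaults (assumptions for this sketch): transport 'stdio',
    host '0.0.0.0' (as documented in the guide), port 8000.
    """
    env = os.environ if env is None else env
    return {
        "transport": env.get("MCP_TRANSPORT", "stdio").strip().lower(),
        "host": env.get("MCP_HOST", "0.0.0.0"),
        "port": int(env.get("PORT", "8000")),
    }


if __name__ == "__main__":
    print(load_transport_settings())
```

Passing a plain dictionary instead of `os.environ` keeps the function easy to exercise in tests.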

--------------------------------------------------------------------------------
/word_document_server/utils/file_utils.py:
--------------------------------------------------------------------------------

```python
 1 | """
 2 | File utility functions for Word Document Server.
 3 | """
 4 | import os
 5 | from typing import Tuple, Optional
 6 | import shutil
 7 | 
 8 | 
 9 | def check_file_writeable(filepath: str) -> Tuple[bool, str]:
10 |     """
11 |     Check if a file can be written to.
12 |     
13 |     Args:
14 |         filepath: Path to the file
15 |         
16 |     Returns:
17 |         Tuple of (is_writeable, error_message)
18 |     """
19 |     # If file doesn't exist, check if directory is writeable
20 |     if not os.path.exists(filepath):
21 |         directory = os.path.dirname(filepath)
22 |         # If no directory is specified (empty string), use current directory
23 |         if directory == '':
24 |             directory = '.'
25 |         if not os.path.exists(directory):
26 |             return False, f"Directory {directory} does not exist"
27 |         if not os.access(directory, os.W_OK):
28 |             return False, f"Directory {directory} is not writeable"
29 |         return True, ""
30 |     
31 |     # If file exists, check if it's writeable
32 |     if not os.access(filepath, os.W_OK):
33 |         return False, f"File {filepath} is not writeable (permission denied)"
34 |     
35 |     # Try to open the file for writing to see if it's locked
36 |     try:
37 |         with open(filepath, 'a'):
38 |             pass
39 |         return True, ""
40 |     except IOError as e:
41 |         return False, f"File {filepath} is not writeable: {str(e)}"
42 |     except Exception as e:
43 |         return False, f"Unknown error checking file permissions: {str(e)}"
44 | 
45 | 
46 | def create_document_copy(source_path: str, dest_path: Optional[str] = None) -> Tuple[bool, str, Optional[str]]:
47 |     """
48 |     Create a copy of a document.
49 |     
50 |     Args:
51 |         source_path: Path to the source document
52 |         dest_path: Optional path for the new document. If not provided, '_copy' is inserted before the source file's extension (e.g. report.docx -> report_copy.docx)
53 |         
54 |     Returns:
55 |         Tuple of (success, message, new_filepath)
56 |     """
57 |     if not os.path.exists(source_path):
58 |         return False, f"Source document {source_path} does not exist", None
59 |     
60 |     if not dest_path:
61 |         # Generate a new filename if not provided
62 |         base, ext = os.path.splitext(source_path)
63 |         dest_path = f"{base}_copy{ext}"
64 |     
65 |     try:
66 |         # Simple file copy
67 |         shutil.copy2(source_path, dest_path)
68 |         return True, f"Document copied to {dest_path}", dest_path
69 |     except Exception as e:
70 |         return False, f"Failed to copy document: {str(e)}", None
71 | 
72 | 
73 | def ensure_docx_extension(filename: str) -> str:
74 |     """
75 |     Ensure filename has .docx extension.
76 |     
77 |     Args:
78 |         filename: The filename to check
79 |         
80 |     Returns:
81 |         Filename with .docx extension
82 |     """
83 |     if not filename.lower().endswith('.docx'):
84 |         return filename + '.docx'
85 |     return filename
86 | 
```
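When no `dest_path` is given, `create_document_copy` derives the destination name by splitting the extension off the source path. A minimal sketch of that naming rule in isolation; the helper name `default_copy_path` is hypothetical and not part of the module:

```python
import os


def default_copy_path(source_path: str) -> str:
    """Insert '_copy' between the base name and the extension (hypothetical helper)."""
    base, ext = os.path.splitext(source_path)
    return f"{base}_copy{ext}"


print(default_copy_path("report.docx"))   # report_copy.docx
print(default_copy_path("notes"))         # notes_copy
```

Because `os.path.splitext` only splits off the last extension, a name without any extension simply gets `_copy` appended.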

--------------------------------------------------------------------------------
/word_document_server/core/unprotect.py:
--------------------------------------------------------------------------------

```python
 1 | """
 2 | Unprotect document functionality for the Word Document Server.
 3 | 
 4 | This module handles removing document protection.
 5 | """
 6 | import os
 7 | import json
 8 | import hashlib
 9 | import tempfile
10 | import shutil
11 | from typing import Tuple, Optional
12 | 
13 | def remove_protection_info(filename: str, password: Optional[str] = None) -> Tuple[bool, str]:
14 |     """
15 |     Remove protection information from a document and decrypt it if necessary.
16 |     
17 |     Args:
18 |         filename: Path to the Word document
19 |         password: Password to verify before removing protection
20 |         
21 |     Returns:
22 |         Tuple of (success, message)
23 |     """
24 |     base_path, _ = os.path.splitext(filename)
25 |     metadata_path = f"{base_path}.protection"
26 |     
27 |     # Check if protection metadata exists
28 |     if not os.path.exists(metadata_path):
29 |         return False, "Document is not protected"
30 |     
31 |     try:
32 |         # Load protection data
33 |         with open(metadata_path, 'r') as f:
34 |             protection_data = json.load(f)
35 |         
36 |         # Verify the password whenever a hash is stored; a missing password must not bypass the check
37 |         stored_hash = protection_data.get("password_hash")
38 |         if stored_hash:
39 |             if not password or hashlib.sha256(password.encode()).hexdigest() != stored_hash:
40 |                 return False, "Incorrect or missing password"
41 |         
42 |         # Handle true encryption if it was applied
43 |         if protection_data.get("true_encryption") and password:
44 |             try:
45 |                 import msoffcrypto
46 |                 
47 |                 # Create a temporary file for the decrypted output
48 |                 temp_fd, temp_path = tempfile.mkstemp(suffix='.docx')
49 |                 os.close(temp_fd)
50 |                 
51 |                 # Open the encrypted document
52 |                 with open(filename, 'rb') as f:
53 |                     office_file = msoffcrypto.OfficeFile(f)
54 |                     
55 |                     # Decrypt with provided password
56 |                     try:
57 |                         office_file.load_key(password=password)
58 |                         
59 |                         # Write the decrypted file to the temp path
60 |                         with open(temp_path, 'wb') as out_file:
61 |                             office_file.decrypt(out_file)
62 |                         
63 |                         # Replace encrypted file with decrypted version
64 |                         shutil.move(temp_path, filename)
65 |                     except Exception as decrypt_error:
66 |                         if os.path.exists(temp_path):
67 |                             os.unlink(temp_path)
68 |                         return False, f"Failed to decrypt document: {str(decrypt_error)}"
69 |             except ImportError:
70 |                 return False, "Missing msoffcrypto package required for encryption/decryption"
71 |             except Exception as e:
72 |                 return False, f"Error decrypting document: {str(e)}"
73 |         
74 |         # Remove the protection metadata file
75 |         os.remove(metadata_path)
76 |         return True, "Protection removed successfully"
77 |     except Exception as e:
78 |         return False, f"Error removing protection: {str(e)}"
79 | 
```
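`remove_protection_info` reads a `password_hash` field from a `.protection` sidecar file and compares it with the SHA-256 hex digest of the supplied password. The sketch below shows one way such a check could look on its own. Both helper names are hypothetical, and the writer side is only an assumption inferred from the keys that `remove_protection_info` reads; the real metadata is produced elsewhere in the project. In this sketch a missing password counts as a failed check whenever a hash is stored.

```python
import hashlib
import hmac
import json
import os
import tempfile
from typing import Optional


def write_protection_metadata(metadata_path: str, password: str) -> None:
    """Hypothetical writer: store only the SHA-256 hex digest of the password."""
    digest = hashlib.sha256(password.encode()).hexdigest()
    with open(metadata_path, "w") as f:
        json.dump({"password_hash": digest}, f)


def password_matches(metadata_path: str, password: Optional[str]) -> bool:
    """Return True when no hash is stored, or when the supplied password matches it."""
    with open(metadata_path, "r") as f:
        stored_hash = json.load(f).get("password_hash")
    if not stored_hash:
        return True
    if not password:
        return False
    candidate = hashlib.sha256(password.encode()).hexdigest()
    # compare_digest takes the same time regardless of where the strings differ
    return hmac.compare_digest(candidate, stored_hash)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "report.protection")
        write_protection_metadata(path, "s3cret")
        print(password_matches(path, "s3cret"))  # True
        print(password_matches(path, "wrong"))   # False
        print(password_matches(path, None))      # False
```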

--------------------------------------------------------------------------------
/test_formatting.py:
--------------------------------------------------------------------------------

```python
  1 | """
  2 | Test script for add_paragraph and add_heading formatting parameters.
  3 | """
  4 | import asyncio
  5 | from docx import Document
  6 | from word_document_server.tools.content_tools import add_paragraph, add_heading
  7 | from word_document_server.tools.document_tools import create_document
  8 | 
  9 | 
 10 | async def test_formatting():
 11 |     """Test the new formatting parameters."""
 12 |     test_doc = 'test_formatting.docx'
 13 | 
 14 |     # Create test document
 15 |     print("Creating test document...")
 16 |     await create_document(test_doc, title="Formatting Test", author="Test Script")
 17 | 
 18 |     # Test 1: Name with large font
 19 |     print("Test 1: Adding name with large Helvetica 36pt bold...")
 20 |     result = await add_paragraph(
 21 |         test_doc,
 22 |         "JAMES MEHORTER",
 23 |         font_name="Helvetica",
 24 |         font_size=36,
 25 |         bold=True
 26 |     )
 27 |     print(f"  Result: {result}")
 28 | 
 29 |     # Test 2: Title line
 30 |     print("Test 2: Adding title with Helvetica 14pt...")
 31 |     result = await add_paragraph(
 32 |         test_doc,
 33 |         "Principal Software Engineer | Technical Team Lead",
 34 |         font_name="Helvetica",
 35 |         font_size=14
 36 |     )
 37 |     print(f"  Result: {result}")
 38 | 
 39 |     # Test 3: Section header with border
 40 |     print("Test 3: Adding section header with border...")
 41 |     result = await add_heading(
 42 |         test_doc,
 43 |         "PROFESSIONAL SUMMARY",
 44 |         level=2,
 45 |         font_name="Helvetica",
 46 |         font_size=14,
 47 |         bold=True,
 48 |         border_bottom=True
 49 |     )
 50 |     print(f"  Result: {result}")
 51 | 
 52 |     # Test 4: Body text in Times New Roman
 53 |     print("Test 4: Adding body text in Times New Roman 14pt...")
 54 |     result = await add_paragraph(
 55 |         test_doc,
 56 |         "This is body text that should be in Times New Roman at 14pt. "
 57 |         "It demonstrates the ability to apply different fonts to different paragraphs.",
 58 |         font_name="Times New Roman",
 59 |         font_size=14
 60 |     )
 61 |     print(f"  Result: {result}")
 62 | 
 63 |     # Test 5: Another section header
 64 |     print("Test 5: Adding another section header with border...")
 65 |     result = await add_heading(
 66 |         test_doc,
 67 |         "SKILLS",
 68 |         level=2,
 69 |         font_name="Helvetica",
 70 |         font_size=14,
 71 |         bold=True,
 72 |         border_bottom=True
 73 |     )
 74 |     print(f"  Result: {result}")
 75 | 
 76 |     # Test 6: Italic text with color
 77 |     print("Test 6: Adding italic text with color...")
 78 |     result = await add_paragraph(
 79 |         test_doc,
 80 |         "This text is italic and colored blue.",
 81 |         font_name="Arial",
 82 |         font_size=12,
 83 |         italic=True,
 84 |         color="0000FF"
 85 |     )
 86 |     print(f"  Result: {result}")
 87 | 
 88 |     print(f"\n✅ Test document created: {test_doc}")
 89 | 
 90 |     # Verify formatting
 91 |     print("\nVerifying formatting...")
 92 |     verify_doc = Document(test_doc)
 93 |     for i, para in enumerate(verify_doc.paragraphs):
 94 |         if para.runs:
 95 |             run = para.runs[0]
 96 |             text_preview = para.text[:50] + "..." if len(para.text) > 50 else para.text
 97 |             print(f"\nParagraph {i}: {text_preview}")
 98 |             print(f"  Font: {run.font.name}")
 99 |             print(f"  Size: {run.font.size}")
100 |             print(f"  Bold: {run.font.bold}")
101 |             print(f"  Italic: {run.font.italic}")
102 | 
103 |     print("\n✅ All tests completed successfully!")
104 |     print(f"Open {test_doc} in Word to verify the formatting visually.")
105 | 
106 | 
107 | if __name__ == "__main__":
108 |     asyncio.run(test_formatting())
109 | 
```

--------------------------------------------------------------------------------
/tests/test_convert_to_pdf.py:
--------------------------------------------------------------------------------

```python
 1 | import asyncio
 2 | from pathlib import Path
 3 | 
 4 | import pytest
 5 | from docx import Document
 6 | 
 7 | # Target for testing: convert_to_pdf (async function)
 8 | from word_document_server.tools.extended_document_tools import convert_to_pdf
 9 | 
10 | 
11 | def _make_sample_docx(path: Path) -> None:
12 |     """Generates a simple .docx file in a temporary directory."""
13 |     doc = Document()
14 |     doc.add_heading("Conversion Test Document", level=1)
15 |     doc.add_paragraph("This is a test paragraph for PDF conversion. Contains ASCII too.")
16 |     doc.add_paragraph("Second paragraph: Contains special characters and spaces to cover path/content edge cases.")
17 |     doc.save(path)
18 | 
19 | 
20 | def test_convert_to_pdf_with_temp_docx(tmp_path: Path):
21 |     """
22 |     End-to-end test: Create a temporary .docx -> call convert_to_pdf -> validate the PDF output.
23 | 
24 |     Notes:
25 |     - On Linux/macOS, it first tries LibreOffice (soffice/libreoffice),
26 |       and falls back to docx2pdf on failure (requires Microsoft Word).
27 |     - If these tools are missing or the command is unavailable, the test is skipped with a reason.
28 |     """
29 |     # 1) Generate a docx file with spaces in its name in the temp directory
30 |     src_doc = tmp_path / "sample document with spaces.docx"
31 |     _make_sample_docx(src_doc)
32 | 
33 |     # 2) Define the output PDF path (also in the temp directory)
34 |     out_pdf = tmp_path / "converted output.pdf"
35 | 
36 |     # 3) Run the asynchronous function under test
37 |     result_msg = asyncio.run(convert_to_pdf(str(src_doc), output_filename=str(out_pdf)))
38 | 
39 |     # 4) Success condition: the return message contains success keywords, or the target PDF exists
40 |     success_keywords = ["successfully converted", "converted to PDF"]
41 |     success = any(k.lower() in result_msg.lower() for k in success_keywords) or out_pdf.exists()
42 | 
43 |     if not success:
44 |         # When LibreOffice or Microsoft Word is not installed, the tool returns a hint.
45 |         # In this case, skip the test instead of failing.
46 |         pytest.skip(f"PDF conversion tool unavailable or conversion failed: {result_msg}")
47 | 
48 |     # 5) Assert: The PDF file was generated and is not empty
49 |     # Some environments (especially docx2pdf) might ignore the exact output filename
50 |     # and just generate a PDF with the same name as the source in the output or source directory,
51 |     # so we check multiple possible locations.
52 |     candidates = [
53 |         out_pdf,
54 |         # Common: A PDF with the same name as the source file in the output directory
55 |         out_pdf.parent / f"{src_doc.stem}.pdf",
56 |         # Fallback: A PDF in the same directory as the source file
57 |         src_doc.with_suffix(".pdf"),
58 |     ]
59 | 
60 |     # If none of the above paths exist, search for any newly generated PDF in the temp directory
61 |     found = None
62 |     for p in candidates:
63 |         if p.exists():
64 |             found = p
65 |             break
66 |     if not found:
67 |         pdfs = sorted(tmp_path.glob("*.pdf"), key=lambda p: p.stat().st_mtime, reverse=True)
68 |         if pdfs:
69 |             found = pdfs[0]
70 | 
71 |     if not found:
72 |         # If the tool returns success but the output can't be found,
73 |         # treat it as an environment/tooling difference and skip instead of failing.
74 |         pytest.skip(f"Could not find the generated PDF. Function output: {result_msg}")
75 | 
76 |     assert found.exists(), f"Generated PDF not found: {found}, function output: {result_msg}"
77 |     assert found.stat().st_size > 0, f"The generated PDF file is empty: {found}"
78 | 
79 | 
80 | if __name__ == "__main__":
81 |     # Allow running this file directly for quick verification:
82 |     #   python tests/test_convert_to_pdf.py
83 |     import sys
84 |     sys.exit(pytest.main([__file__, "-q"]))
85 | 
```
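The test above skips only after `convert_to_pdf` has run and reported a failure. A converter could also be detected up front. The sketch below assumes the executable names `soffice` and `libreoffice` mentioned in the test's docstring and assumes they would be resolved through `PATH`; the helper name `find_converter` is hypothetical.

```python
import shutil
from typing import Optional, Sequence


def find_converter(candidates: Sequence[str] = ("soffice", "libreoffice")) -> Optional[str]:
    """Return the full path of the first candidate executable found on PATH, else None."""
    for name in candidates:
        path = shutil.which(name)
        if path:
            return path
    return None


converter = find_converter()
if converter is None:
    print("LibreOffice not found on PATH; conversion would need another backend")
else:
    print(f"LibreOffice found at {converter}")
```

A result of `None` could feed a `pytest.mark.skipif` condition, although that alone would not cover the docx2pdf fallback the docstring describes.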

--------------------------------------------------------------------------------
/word_document_server/core/styles.py:
--------------------------------------------------------------------------------

```python
  1 | """
  2 | Style-related functions for Word Document Server.
  3 | """
  4 | from docx.shared import Pt
  5 | from docx.enum.style import WD_STYLE_TYPE
  6 | 
  7 | 
  8 | def ensure_heading_style(doc):
  9 |     """
 10 |     Ensure Heading styles exist in the document.
 11 |     
 12 |     Args:
 13 |         doc: Document object
 14 |     """
 15 |     for i in range(1, 10):  # Create Heading 1 through Heading 9
 16 |         style_name = f'Heading {i}'
 17 |         try:
 18 |             # Try to access the style to see if it exists
 19 |             style = doc.styles[style_name]
 20 |         except KeyError:
 21 |             # Create the style if it doesn't exist
 22 |             try:
 23 |                 style = doc.styles.add_style(style_name, WD_STYLE_TYPE.PARAGRAPH)
 24 |                 if i == 1:
 25 |                     style.font.size = Pt(16)
 26 |                     style.font.bold = True
 27 |                 elif i == 2:
 28 |                     style.font.size = Pt(14)
 29 |                     style.font.bold = True
 30 |                 else:
 31 |                     style.font.size = Pt(12)
 32 |                     style.font.bold = True
 33 |             except Exception:
 34 |                 # If style creation fails, we'll just use default formatting
 35 |                 pass
 36 | 
 37 | 
 38 | def ensure_table_style(doc):
 39 |     """
 40 |     Ensure Table Grid style exists in the document.
 41 |     
 42 |     Args:
 43 |         doc: Document object
 44 |     """
 45 |     try:
 46 |         # Try to access the style to see if it exists
 47 |         style = doc.styles['Table Grid']
 48 |     except KeyError:
 49 |         # If style doesn't exist, we'll handle it at usage time
 50 |         pass
 51 | 
 52 | 
 53 | def create_style(doc, style_name, style_type, base_style=None, font_properties=None, paragraph_properties=None):
 54 |     """
 55 |     Create a new style in the document.
 56 |     
 57 |     Args:
 58 |         doc: Document object
 59 |         style_name: Name for the new style
 60 |         style_type: Type of style (WD_STYLE_TYPE)
 61 |         base_style: Optional base style to inherit from
 62 |         font_properties: Dictionary of font properties (bold, italic, size, name, color)
 63 |         paragraph_properties: Dictionary of paragraph properties (alignment, spacing)
 64 |         
 65 |     Returns:
 66 |         The created style
 67 |     """
 68 |     from docx.shared import Pt
 69 |     
 70 |     try:
 71 |         # Look the style up by name; get_by_id() expects a style id and falls back to the default style
 72 |         style = doc.styles[style_name]
 73 |         return style
 74 |     except KeyError:
 75 |         # Create new style
 76 |         new_style = doc.styles.add_style(style_name, style_type)
 77 |         
 78 |         # Set base style if specified
 79 |         if base_style:
 80 |             new_style.base_style = doc.styles[base_style]
 81 |         
 82 |         # Set font properties
 83 |         if font_properties:
 84 |             font = new_style.font
 85 |             if 'bold' in font_properties:
 86 |                 font.bold = font_properties['bold']
 87 |             if 'italic' in font_properties:
 88 |                 font.italic = font_properties['italic']
 89 |             if 'size' in font_properties:
 90 |                 font.size = Pt(font_properties['size'])
 91 |             if 'name' in font_properties:
 92 |                 font.name = font_properties['name']
 93 |             if 'color' in font_properties:
 94 |                 from docx.shared import RGBColor
 95 |                 
 96 |                 # Define common RGB colors
 97 |                 color_map = {
 98 |                     'red': RGBColor(255, 0, 0),
 99 |                     'blue': RGBColor(0, 0, 255),
100 |                     'green': RGBColor(0, 128, 0),
101 |                     'yellow': RGBColor(255, 255, 0),
102 |                     'black': RGBColor(0, 0, 0),
103 |                     'gray': RGBColor(128, 128, 128),
104 |                     'white': RGBColor(255, 255, 255),
105 |                     'purple': RGBColor(128, 0, 128),
106 |                     'orange': RGBColor(255, 165, 0)
107 |                 }
108 |                 
109 |                 color_value = font_properties['color']
110 |                 try:
111 |                     # Handle string color names
112 |                     if isinstance(color_value, str) and color_value.lower() in color_map:
113 |                         font.color.rgb = color_map[color_value.lower()]
114 |                     # Handle RGBColor objects
115 |                     elif isinstance(color_value, RGBColor):
116 |                         font.color.rgb = color_value
117 |                     # Try to parse as RGB string
118 |                     elif isinstance(color_value, str):
119 |                         font.color.rgb = RGBColor.from_string(color_value)
120 |                     # Use directly if it's already an RGB value
121 |                     else:
122 |                         font.color.rgb = color_value
123 |                 except Exception as e:
124 |                     # Fallback to black if all else fails
125 |                     font.color.rgb = RGBColor(0, 0, 0)
126 |         
127 |         # Set paragraph properties
128 |         if paragraph_properties:
129 |             if 'alignment' in paragraph_properties:
130 |                 new_style.paragraph_format.alignment = paragraph_properties['alignment']
131 |             if 'spacing' in paragraph_properties:
132 |                 new_style.paragraph_format.line_spacing = paragraph_properties['spacing']
133 |         
134 |         return new_style
135 | 
```
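The color handling in `create_style` accepts a color name, a six-digit hex string, or a ready-made color value, and falls back to black when nothing fits. The sketch below reproduces the string part of that decision order in plain Python, returning `(r, g, b)` tuples instead of `RGBColor` objects so it runs without python-docx. The function name `resolve_color` is hypothetical and the color table is a shortened copy of the one in the source.

```python
import string
from typing import Tuple

# Shortened copy of the name table used in create_style
NAMED_COLORS = {
    "red": (255, 0, 0),
    "blue": (0, 0, 255),
    "green": (0, 128, 0),
    "black": (0, 0, 0),
    "orange": (255, 165, 0),
}


def resolve_color(value: str) -> Tuple[int, int, int]:
    """Map a color name or six-digit hex string to an (r, g, b) tuple; black on failure."""
    key = value.strip().lower()
    if key in NAMED_COLORS:
        return NAMED_COLORS[key]
    if len(key) == 6 and all(ch in string.hexdigits for ch in key):
        return (int(key[0:2], 16), int(key[2:4], 16), int(key[4:6], 16))
    # Same fallback as the source: unknown input becomes black
    return (0, 0, 0)


print(resolve_color("Blue"))         # (0, 0, 255)
print(resolve_color("0000FF"))       # (0, 0, 255)
print(resolve_color("not-a-color"))  # (0, 0, 0)
```

Falling back to black silently keeps style creation from failing, at the cost of hiding typos in color names.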

--------------------------------------------------------------------------------
/word_document_server/tools/comment_tools.py:
--------------------------------------------------------------------------------

```python
  1 | """
  2 | Comment extraction tools for Word Document Server.
  3 | 
  4 | These tools provide high-level interfaces for extracting and analyzing
  5 | comments from Word documents through the MCP protocol.
  6 | """
  7 | import os
  8 | import json
  9 | from typing import Dict, List, Optional, Any
 10 | from docx import Document
 11 | 
 12 | from word_document_server.utils.file_utils import ensure_docx_extension
 13 | from word_document_server.core.comments import (
 14 |     extract_all_comments,
 15 |     filter_comments_by_author,
 16 |     get_comments_for_paragraph as core_get_comments_for_paragraph
 17 | )
 18 | 
 19 | 
 20 | async def get_all_comments(filename: str) -> str:
 21 |     """
 22 |     Extract all comments from a Word document.
 23 |     
 24 |     Args:
 25 |         filename: Path to the Word document
 26 |         
 27 |     Returns:
 28 |         JSON string containing all comments with metadata
 29 |     """
 30 |     filename = ensure_docx_extension(filename)
 31 |     
 32 |     if not os.path.exists(filename):
 33 |         return json.dumps({
 34 |             'success': False,
 35 |             'error': f'Document {filename} does not exist'
 36 |         }, indent=2)
 37 |     
 38 |     try:
 39 |         # Load the document
 40 |         doc = Document(filename)
 41 |         
 42 |         # Extract all comments
 43 |         comments = extract_all_comments(doc)
 44 |         
 45 |         # Return results
 46 |         return json.dumps({
 47 |             'success': True,
 48 |             'comments': comments,
 49 |             'total_comments': len(comments)
 50 |         }, indent=2)
 51 |         
 52 |     except Exception as e:
 53 |         return json.dumps({
 54 |             'success': False,
 55 |             'error': f'Failed to extract comments: {str(e)}'
 56 |         }, indent=2)
 57 | 
 58 | 
 59 | async def get_comments_by_author(filename: str, author: str) -> str:
 60 |     """
 61 |     Extract comments from a specific author in a Word document.
 62 |     
 63 |     Args:
 64 |         filename: Path to the Word document
 65 |         author: Name of the comment author to filter by
 66 |         
 67 |     Returns:
 68 |         JSON string containing filtered comments
 69 |     """
 70 |     filename = ensure_docx_extension(filename)
 71 |     
 72 |     if not os.path.exists(filename):
 73 |         return json.dumps({
 74 |             'success': False,
 75 |             'error': f'Document {filename} does not exist'
 76 |         }, indent=2)
 77 |     
 78 |     if not author or not author.strip():
 79 |         return json.dumps({
 80 |             'success': False,
 81 |             'error': 'Author name cannot be empty'
 82 |         }, indent=2)
 83 |     
 84 |     try:
 85 |         # Load the document
 86 |         doc = Document(filename)
 87 |         
 88 |         # Extract all comments
 89 |         all_comments = extract_all_comments(doc)
 90 |         
 91 |         # Filter by author
 92 |         author_comments = filter_comments_by_author(all_comments, author)
 93 |         
 94 |         # Return results
 95 |         return json.dumps({
 96 |             'success': True,
 97 |             'author': author,
 98 |             'comments': author_comments,
 99 |             'total_comments': len(author_comments)
100 |         }, indent=2)
101 |         
102 |     except Exception as e:
103 |         return json.dumps({
104 |             'success': False,
105 |             'error': f'Failed to extract comments: {str(e)}'
106 |         }, indent=2)
107 | 
108 | 
109 | async def get_comments_for_paragraph(filename: str, paragraph_index: int) -> str:
110 |     """
111 |     Extract comments for a specific paragraph in a Word document.
112 |     
113 |     Args:
114 |         filename: Path to the Word document
115 |         paragraph_index: Index of the paragraph (0-based)
116 |         
117 |     Returns:
118 |         JSON string containing comments for the specified paragraph
119 |     """
120 |     filename = ensure_docx_extension(filename)
121 |     
122 |     if not os.path.exists(filename):
123 |         return json.dumps({
124 |             'success': False,
125 |             'error': f'Document {filename} does not exist'
126 |         }, indent=2)
127 |     
128 |     if paragraph_index < 0:
129 |         return json.dumps({
130 |             'success': False,
131 |             'error': 'Paragraph index must be non-negative'
132 |         }, indent=2)
133 |     
134 |     try:
135 |         # Load the document
136 |         doc = Document(filename)
137 |         
138 |         # Check if paragraph index is valid
139 |         if paragraph_index >= len(doc.paragraphs):
140 |             return json.dumps({
141 |                 'success': False,
142 |                 'error': f'Paragraph index {paragraph_index} is out of range. Document has {len(doc.paragraphs)} paragraphs.'
143 |             }, indent=2)
144 |         
145 |         # Extract all comments
146 |         all_comments = extract_all_comments(doc)
147 |         
148 |         # Filter for the specific paragraph
149 |         from word_document_server.core.comments import get_comments_for_paragraph as core_get_comments_for_paragraph
150 |         para_comments = core_get_comments_for_paragraph(all_comments, paragraph_index)
151 |         
152 |         # Get the paragraph text for context
153 |         paragraph_text = doc.paragraphs[paragraph_index].text
154 |         
155 |         # Return results
156 |         return json.dumps({
157 |             'success': True,
158 |             'paragraph_index': paragraph_index,
159 |             'paragraph_text': paragraph_text,
160 |             'comments': para_comments,
161 |             'total_comments': len(para_comments)
162 |         }, indent=2)
163 |         
164 |     except Exception as e:
165 |         return json.dumps({
166 |             'success': False,
167 |             'error': f'Failed to extract comments: {str(e)}'
168 |         }, indent=2)
```
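All three tools return a JSON string with a `success` flag and either an `error` message or a `comments` list. A caller might unwrap such a response as sketched below. The helper name `unwrap_comments` is hypothetical, and the sample payloads are hand-written: the fields inside each comment are made up for illustration, since the real comment structure is defined in `core/comments.py`.

```python
import json
from typing import List


def unwrap_comments(raw_response: str) -> List[dict]:
    """Parse a tool response and return its comments, raising on a reported failure."""
    payload = json.loads(raw_response)
    if not payload.get("success"):
        raise RuntimeError(payload.get("error", "unknown error"))
    return payload["comments"]


# Hand-written sample responses (not produced by the real tools)
ok = json.dumps({"success": True, "comments": [{"text": "Check this figure"}], "total_comments": 1})
failed = json.dumps({"success": False, "error": "Document missing.docx does not exist"})

print(unwrap_comments(ok))
try:
    unwrap_comments(failed)
except RuntimeError as exc:
    print(f"Tool reported: {exc}")
```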

--------------------------------------------------------------------------------
/word_document_server/utils/extended_document_utils.py:
--------------------------------------------------------------------------------

```python
  1 | """
  2 | Extended document utilities for Word Document Server.
  3 | """
  4 | from typing import Dict, List, Any, Tuple
  5 | from docx import Document
  6 | 
  7 | 
  8 | def get_paragraph_text(doc_path: str, paragraph_index: int) -> Dict[str, Any]:
  9 |     """
 10 |     Get text from a specific paragraph in a Word document.
 11 |     
 12 |     Args:
 13 |         doc_path: Path to the Word document
 14 |         paragraph_index: Index of the paragraph to extract (0-based)
 15 |     
 16 |     Returns:
 17 |         Dictionary with paragraph text and metadata
 18 |     """
 19 |     import os
 20 |     if not os.path.exists(doc_path):
 21 |         return {"error": f"Document {doc_path} does not exist"}
 22 |     
 23 |     try:
 24 |         doc = Document(doc_path)
 25 |         
 26 |         # Check if paragraph index is valid
 27 |         if paragraph_index < 0 or paragraph_index >= len(doc.paragraphs):
 28 |             return {"error": f"Invalid paragraph index: {paragraph_index}. Document has {len(doc.paragraphs)} paragraphs."}
 29 |         
 30 |         paragraph = doc.paragraphs[paragraph_index]
 31 |         
 32 |         return {
 33 |             "index": paragraph_index,
 34 |             "text": paragraph.text,
 35 |             "style": paragraph.style.name if paragraph.style else "Normal",
 36 |             "is_heading": paragraph.style.name.startswith("Heading") if paragraph.style else False
 37 |         }
 38 |     except Exception as e:
 39 |         return {"error": f"Failed to get paragraph text: {str(e)}"}
 40 | 
 41 | 
 42 | def find_text(doc_path: str, text_to_find: str, match_case: bool = True, whole_word: bool = False) -> Dict[str, Any]:
 43 |     """
 44 |     Find all occurrences of specific text in a Word document.
 45 |     
 46 |     Args:
 47 |         doc_path: Path to the Word document
 48 |         text_to_find: Text to search for
 49 |         match_case: Whether to perform case-sensitive search
 50 |         whole_word: Whether to match whole words only
 51 |     
 52 |     Returns:
 53 |         Dictionary with search results
 54 |     """
 55 |     import os
 56 |     if not os.path.exists(doc_path):
 57 |         return {"error": f"Document {doc_path} does not exist"}
 58 |     
 59 |     if not text_to_find:
 60 |         return {"error": "Search text cannot be empty"}
 61 |     
 62 |     try:
 63 |         doc = Document(doc_path)
 64 |         results = {
 65 |             "query": text_to_find,
 66 |             "match_case": match_case,
 67 |             "whole_word": whole_word,
 68 |             "occurrences": [],
 69 |             "total_count": 0
 70 |         }
 71 |         
 72 |         # Search in paragraphs
 73 |         for i, para in enumerate(doc.paragraphs):
 74 |             # Prepare text for comparison
 75 |             para_text = para.text
 76 |             search_text = text_to_find
 77 |             
 78 |             if not match_case:
 79 |                 para_text = para_text.lower()
 80 |                 search_text = search_text.lower()
 81 |             
 82 |             # Find all occurrences (simple implementation)
 83 |             start_pos = 0
 84 |             while True:
 85 |                 if whole_word:
 86 |                     # For whole word search, we need to check word boundaries
 87 |                     words = para_text.split()
 88 |                     found = False
 89 |                     for word_idx, word in enumerate(words):
 90 |                         if (word == search_text or 
 91 |                             (not match_case and word.lower() == search_text.lower())):
 92 |                             results["occurrences"].append({
 93 |                                 "paragraph_index": i,
 94 |                                 "position": word_idx,
 95 |                                 "context": para.text[:100] + ("..." if len(para.text) > 100 else "")
 96 |                             })
 97 |                             results["total_count"] += 1
 98 |                             found = True
 99 |                     
100 |                     # Break after checking all words
101 |                     break
102 |                 else:
103 |                     # For substring search
104 |                     pos = para_text.find(search_text, start_pos)
105 |                     if pos == -1:
106 |                         break
107 |                     
108 |                     results["occurrences"].append({
109 |                         "paragraph_index": i,
110 |                         "position": pos,
111 |                         "context": para.text[:100] + ("..." if len(para.text) > 100 else "")
112 |                     })
113 |                     results["total_count"] += 1
114 |                     start_pos = pos + len(search_text)
115 |         
116 |         # Search in tables
117 |         for table_idx, table in enumerate(doc.tables):
118 |             for row_idx, row in enumerate(table.rows):
119 |                 for col_idx, cell in enumerate(row.cells):
120 |                     for para_idx, para in enumerate(cell.paragraphs):
121 |                         # Prepare text for comparison
122 |                         para_text = para.text
123 |                         search_text = text_to_find
124 |                         
125 |                         if not match_case:
126 |                             para_text = para_text.lower()
127 |                             search_text = search_text.lower()
128 |                         
129 |                         # Find all occurrences (simple implementation)
130 |                         start_pos = 0
131 |                         while True:
132 |                             if whole_word:
133 |                                 # For whole word search, check word boundaries
134 |                                 words = para_text.split()
135 |                                 found = False
136 |                                 for word_idx, word in enumerate(words):
137 |                                     if (word == search_text or 
138 |                                         (not match_case and word.lower() == search_text.lower())):
139 |                                         results["occurrences"].append({
140 |                                             "location": f"Table {table_idx}, Row {row_idx}, Column {col_idx}",
141 |                                             "position": word_idx,
142 |                                             "context": para.text[:100] + ("..." if len(para.text) > 100 else "")
143 |                                         })
144 |                                         results["total_count"] += 1
145 |                                         found = True
146 |                                 
147 |                                 # Break after checking all words
148 |                                 break
149 |                             else:
150 |                                 # For substring search
151 |                                 pos = para_text.find(search_text, start_pos)
152 |                                 if pos == -1:
153 |                                     break
154 |                                 
155 |                                 results["occurrences"].append({
156 |                                     "location": f"Table {table_idx}, Row {row_idx}, Column {col_idx}",
157 |                                     "position": pos,
158 |                                     "context": para.text[:100] + ("..." if len(para.text) > 100 else "")
159 |                                 })
160 |                                 results["total_count"] += 1
161 |                                 start_pos = pos + len(search_text)
162 |         
163 |         return results
164 |     except Exception as e:
165 |         return {"error": f"Failed to search for text: {str(e)}"}
166 | 
```
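
The whole-word branch of `find_text` above splits each paragraph on whitespace, so a word followed by punctuation (for example `report,`) is not matched, and `position` is a word index rather than a character offset. A minimal sketch of an alternative based on regular-expression lookarounds follows. `find_whole_word` is a hypothetical helper for illustration, not a function in this repository.

```python
import re
from typing import List


def find_whole_word(text: str, term: str, match_case: bool = True) -> List[int]:
    """Return the character offsets of whole-word matches of term in text.

    Hypothetical helper for illustration; not part of word_document_server.
    """
    flags = 0 if match_case else re.IGNORECASE
    # re.escape keeps characters such as "." or "+" in the term literal.
    # The lookarounds require a non-word character (or the string edge)
    # on both sides of the match.
    pattern = re.compile(r"(?<!\w)" + re.escape(term) + r"(?!\w)", flags)
    return [m.start() for m in pattern.finditer(text)]


if __name__ == "__main__":
    sample = "The report, the Report and reporting."
    print(find_whole_word(sample, "report"))
    print(find_whole_word(sample, "report", match_case=False))
```

Lookarounds are used instead of `\b` so that search terms ending in punctuation, such as `C++`, still match as whole words.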

--------------------------------------------------------------------------------
/word_document_server/core/comments.py:
--------------------------------------------------------------------------------

```python
  1 | """
  2 | Core comment extraction functionality for Word documents.
  3 | 
  4 | This module provides low-level functions to extract and process comments
  5 | from Word documents using the python-docx library.
  6 | """
  7 | import datetime
  8 | from typing import Dict, List, Optional, Any
  9 | from docx import Document
 10 | from docx.document import Document as DocumentType
 11 | from docx.text.paragraph import Paragraph
 12 | 
 13 | 
 14 | def extract_all_comments(doc: DocumentType) -> List[Dict[str, Any]]:
 15 |     """
 16 |     Extract all comments from a Word document.
 17 |     
 18 |     Args:
 19 |         doc: The Document object to extract comments from
 20 |         
 21 |     Returns:
 22 |         List of dictionaries containing comment information
 23 |     """
 24 |     comments = []
 25 |     
 26 |     # Access the document's comment part if it exists
 27 |     try:
 28 |         # Get the document part
 29 |         document_part = doc.part
 30 |         
 31 |         # Find comments part through relationships
 32 |         comments_part = None
 33 |         for rel in document_part.rels.values():
 34 |             if rel.reltype.split('/')[-1] == 'comments':
 35 |                 comments_part = rel.target_part
 36 |                 break
 37 |         
 38 |         if comments_part:
 39 |             # Extract comments from the comments part using proper xpath syntax
 40 |             comment_elements = comments_part.element.xpath('.//w:comment')
 41 |             
 42 |             for idx, comment_element in enumerate(comment_elements):
 43 |                 comment_data = extract_comment_data(comment_element, idx)
 44 |                 if comment_data:
 45 |                     comments.append(comment_data)
 46 |         
 47 |         # If no comments found, try alternative approach
 48 |         if not comments:
 49 |             # Fallback: scan paragraphs for comment references
 50 |             comments = extract_comments_from_paragraphs(doc)
 51 |     
 52 |     except Exception as e:
 53 |         # If direct access fails, try alternative approach
 54 |         comments = extract_comments_from_paragraphs(doc)
 55 |     
 56 |     return comments
 57 | 
 58 | 
 59 | def extract_comments_from_paragraphs(doc: DocumentType) -> List[Dict[str, Any]]:
 60 |     """
 61 |     Extract comments by scanning paragraphs for comment references.
 62 |     
 63 |     Args:
 64 |         doc: The Document object
 65 |         
 66 |     Returns:
 67 |         List of comment dictionaries
 68 |     """
 69 |     comments = []
 70 |     comment_id = 1
 71 |     
 72 |     # Check all paragraphs in the document
 73 |     for para_idx, paragraph in enumerate(doc.paragraphs):
 74 |         para_comments = find_paragraph_comments(paragraph, para_idx, comment_id)
 75 |         comments.extend(para_comments)
 76 |         comment_id += len(para_comments)
 77 |     
 78 |     # Check paragraphs in tables
 79 |     for table in doc.tables:
 80 |         for row in table.rows:
 81 |             for cell in row.cells:
 82 |                 for para_idx, paragraph in enumerate(cell.paragraphs):
 83 |                     para_comments = find_paragraph_comments(paragraph, para_idx, comment_id, in_table=True)
 84 |                     comments.extend(para_comments)
 85 |                     comment_id += len(para_comments)
 86 |     
 87 |     return comments
 88 | 
 89 | 
 90 | def extract_comment_data(comment_element, index: int) -> Optional[Dict[str, Any]]:
 91 |     """
 92 |     Extract data from a comment XML element.
 93 |     
 94 |     Args:
 95 |         comment_element: The XML comment element
 96 |         index: Index for generating a unique ID
 97 |         
 98 |     Returns:
 99 |         Dictionary with comment data or None
100 |     """
101 |     try:
102 |         # Extract comment attributes
103 |         comment_id = comment_element.get('{http://schemas.openxmlformats.org/wordprocessingml/2006/main}id', str(index))
104 |         author = comment_element.get('{http://schemas.openxmlformats.org/wordprocessingml/2006/main}author', 'Unknown')
105 |         initials = comment_element.get('{http://schemas.openxmlformats.org/wordprocessingml/2006/main}initials', '')
106 |         date_str = comment_element.get('{http://schemas.openxmlformats.org/wordprocessingml/2006/main}date', '')
107 |         
108 |         # Parse date if available
109 |         date = None
110 |         if date_str:
111 |             try:
112 |                 date = datetime.datetime.fromisoformat(date_str.replace('Z', '+00:00'))
113 |                 date = date.isoformat()
114 |             except ValueError:
115 |                 date = date_str
116 |         
117 |         # Extract comment text
118 |         text_elements = comment_element.xpath('.//w:t')
119 |         text = ''.join(elem.text or '' for elem in text_elements)
120 |         
121 |         return {
122 |             'id': f'comment_{index + 1}',
123 |             'comment_id': comment_id,
124 |             'author': author,
125 |             'initials': initials,
126 |             'date': date,
127 |             'text': text.strip(),
128 |             'paragraph_index': None,  # Will be filled if we can determine it
129 |             'in_table': False,
130 |             'reference_text': ''
131 |         }
132 |     
133 |     except Exception as e:
134 |         return None
135 | 
136 | 
137 | def find_paragraph_comments(paragraph: Paragraph, para_index: int, 
138 |                            start_id: int, in_table: bool = False) -> List[Dict[str, Any]]:
139 |     """
140 |     Find comments associated with a specific paragraph.
141 |     
142 |     Args:
143 |         paragraph: The paragraph to check
144 |         para_index: The index of the paragraph
145 |         start_id: Starting ID for comments
146 |         in_table: Whether the paragraph is in a table
147 |         
148 |     Returns:
149 |         List of comment dictionaries
150 |     """
151 |     comments = []
152 |     
153 |     try:
154 |         # Access the paragraph's XML element
155 |         para_xml = paragraph._element
156 |         
157 |         # Look for comment range markers (simplified; a full version would need real XML parsing)
158 |         # Serialize the element first: str() on an lxml element returns only its repr
159 |         xml_text = para_xml.xml
160 |         
161 |         # Simple check for comment references in the XML
162 |         if 'commentRangeStart' in xml_text or 'commentReference' in xml_text:
163 |             # Create a placeholder comment entry
164 |             comment_info = {
165 |                 'id': f'comment_{start_id}',
166 |                 'comment_id': f'{start_id}',
167 |                 'author': 'Unknown',
168 |                 'initials': '',
169 |                 'date': None,
170 |                 'text': 'Comment detected but content not accessible',
171 |                 'paragraph_index': para_index,
172 |                 'in_table': in_table,
173 |                 'reference_text': paragraph.text[:50] + '...' if len(paragraph.text) > 50 else paragraph.text
174 |             }
175 |             comments.append(comment_info)
176 |     
177 |     except Exception:
178 |         # If we can't access the XML, skip this paragraph
179 |         pass
180 |     
181 |     return comments
182 | 
183 | 
184 | def filter_comments_by_author(comments: List[Dict[str, Any]], author: str) -> List[Dict[str, Any]]:
185 |     """
186 |     Filter comments by author name.
187 |     
188 |     Args:
189 |         comments: List of comment dictionaries
190 |         author: Author name to filter by (case-insensitive)
191 |         
192 |     Returns:
193 |         Filtered list of comments
194 |     """
195 |     author_lower = author.lower()
196 |     return [c for c in comments if c.get('author', '').lower() == author_lower]
197 | 
198 | 
199 | def get_comments_for_paragraph(comments: List[Dict[str, Any]], paragraph_index: int) -> List[Dict[str, Any]]:
200 |     """
201 |     Get all comments for a specific paragraph.
202 |     
203 |     Args:
204 |         comments: List of all comments
205 |         paragraph_index: Index of the paragraph
206 |         
207 |     Returns:
208 |         Comments for the specified paragraph
209 |     """
210 |     return [c for c in comments if c.get('paragraph_index') == paragraph_index]
```
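
`extract_comment_data` above reads the `w:id`, `w:author`, `w:initials` and `w:date` attributes of each `w:comment` element and joins its `w:t` text nodes. The sketch below mirrors that logic using only the standard library on a hand-written XML fragment. The fragment and the `parse_comments` name are assumptions made for illustration; the module itself works on python-docx elements.

```python
import datetime
import xml.etree.ElementTree as ET
from typing import Any, Dict, List

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

# Hand-written sample of a comments part (assumption, for illustration only).
SAMPLE_COMMENTS_XML = f"""
<w:comments xmlns:w="{W_NS}">
  <w:comment w:id="0" w:author="Alice" w:initials="A" w:date="2024-01-15T10:30:00Z">
    <w:p><w:r><w:t>Please </w:t></w:r><w:r><w:t>rephrase.</w:t></w:r></w:p>
  </w:comment>
</w:comments>
""".strip()


def parse_comments(xml_text: str) -> List[Dict[str, Any]]:
    """Collect id, author, initials, date and text for each w:comment element."""
    root = ET.fromstring(xml_text)
    comments = []
    for index, element in enumerate(root.iter(f"{{{W_NS}}}comment")):
        date_str = element.get(f"{{{W_NS}}}date", "")
        date = None
        if date_str:
            try:
                # Older Python versions reject a trailing "Z", hence the replacement.
                date = datetime.datetime.fromisoformat(
                    date_str.replace("Z", "+00:00")
                ).isoformat()
            except ValueError:
                date = date_str  # keep the raw string if it is not ISO 8601
        # A comment's text can be split across several runs; join all w:t nodes.
        text = "".join(t.text or "" for t in element.iter(f"{{{W_NS}}}t"))
        comments.append({
            "id": f"comment_{index + 1}",
            "comment_id": element.get(f"{{{W_NS}}}id", str(index)),
            "author": element.get(f"{{{W_NS}}}author", "Unknown"),
            "initials": element.get(f"{{{W_NS}}}initials", ""),
            "date": date,
            "text": text.strip(),
        })
    return comments


if __name__ == "__main__":
    for comment in parse_comments(SAMPLE_COMMENTS_XML):
        print(comment)
```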

--------------------------------------------------------------------------------
/word_document_server/tools/document_tools.py:
--------------------------------------------------------------------------------

```python
  1 | """
  2 | Document creation and manipulation tools for Word Document Server.
  3 | """
  4 | import os
  5 | import json
  6 | from typing import Dict, List, Optional, Any
  7 | from docx import Document
  8 | 
  9 | from word_document_server.utils.file_utils import check_file_writeable, ensure_docx_extension, create_document_copy
 10 | from word_document_server.utils.document_utils import get_document_properties, extract_document_text, get_document_structure, get_document_xml, insert_header_near_text, insert_line_or_paragraph_near_text
 11 | from word_document_server.core.styles import ensure_heading_style, ensure_table_style
 12 | 
 13 | 
 14 | async def create_document(filename: str, title: Optional[str] = None, author: Optional[str] = None) -> str:
 15 |     """Create a new Word document with optional metadata.
 16 |     
 17 |     Args:
 18 |         filename: Name of the document to create (with or without .docx extension)
 19 |         title: Optional title for the document metadata
 20 |         author: Optional author for the document metadata
 21 |     """
 22 |     filename = ensure_docx_extension(filename)
 23 |     
 24 |     # Check if file is writeable
 25 |     is_writeable, error_message = check_file_writeable(filename)
 26 |     if not is_writeable:
 27 |         return f"Cannot create document: {error_message}"
 28 |     
 29 |     try:
 30 |         doc = Document()
 31 |         
 32 |         # Set properties if provided
 33 |         if title:
 34 |             doc.core_properties.title = title
 35 |         if author:
 36 |             doc.core_properties.author = author
 37 |         
 38 |         # Ensure necessary styles exist
 39 |         ensure_heading_style(doc)
 40 |         ensure_table_style(doc)
 41 |         
 42 |         # Save the document
 43 |         doc.save(filename)
 44 |         
 45 |         return f"Document {filename} created successfully"
 46 |     except Exception as e:
 47 |         return f"Failed to create document: {str(e)}"
 48 | 
 49 | 
 50 | async def get_document_info(filename: str) -> str:
 51 |     """Get information about a Word document.
 52 |     
 53 |     Args:
 54 |         filename: Path to the Word document
 55 |     """
 56 |     filename = ensure_docx_extension(filename)
 57 |     
 58 |     if not os.path.exists(filename):
 59 |         return f"Document {filename} does not exist"
 60 |     
 61 |     try:
 62 |         properties = get_document_properties(filename)
 63 |         return json.dumps(properties, indent=2)
 64 |     except Exception as e:
 65 |         return f"Failed to get document info: {str(e)}"
 66 | 
 67 | 
 68 | async def get_document_text(filename: str) -> str:
 69 |     """Extract all text from a Word document.
 70 |     
 71 |     Args:
 72 |         filename: Path to the Word document
 73 |     """
 74 |     filename = ensure_docx_extension(filename)
 75 |     
 76 |     return extract_document_text(filename)
 77 | 
 78 | 
 79 | async def get_document_outline(filename: str) -> str:
 80 |     """Get the structure of a Word document.
 81 |     
 82 |     Args:
 83 |         filename: Path to the Word document
 84 |     """
 85 |     filename = ensure_docx_extension(filename)
 86 |     
 87 |     structure = get_document_structure(filename)
 88 |     return json.dumps(structure, indent=2)
 89 | 
 90 | 
 91 | async def list_available_documents(directory: str = ".") -> str:
 92 |     """List all .docx files in the specified directory.
 93 |     
 94 |     Args:
 95 |         directory: Directory to search for Word documents
 96 |     """
 97 |     try:
 98 |         if not os.path.exists(directory):
 99 |             return f"Directory {directory} does not exist"
100 |         
101 |         docx_files = [f for f in os.listdir(directory) if f.endswith('.docx')]
102 |         
103 |         if not docx_files:
104 |             return f"No Word documents found in {directory}"
105 |         
106 |         result = f"Found {len(docx_files)} Word documents in {directory}:\n"
107 |         for file in docx_files:
108 |             file_path = os.path.join(directory, file)
109 |             size = os.path.getsize(file_path) / 1024  # KB
110 |             result += f"- {file} ({size:.2f} KB)\n"
111 |         
112 |         return result
113 |     except Exception as e:
114 |         return f"Failed to list documents: {str(e)}"
115 | 
116 | 
117 | async def copy_document(source_filename: str, destination_filename: Optional[str] = None) -> str:
118 |     """Create a copy of a Word document.
119 |     
120 |     Args:
121 |         source_filename: Path to the source document
122 |         destination_filename: Optional path for the copy. If not provided, a default name will be generated.
123 |     """
124 |     source_filename = ensure_docx_extension(source_filename)
125 |     
126 |     if destination_filename:
127 |         destination_filename = ensure_docx_extension(destination_filename)
128 |     
129 |     success, message, new_path = create_document_copy(source_filename, destination_filename)
130 |     if success:
131 |         return message
132 |     else:
133 |         return f"Failed to copy document: {message}"
134 | 
135 | 
136 | async def merge_documents(target_filename: str, source_filenames: List[str], add_page_breaks: bool = True) -> str:
137 |     """Merge multiple Word documents into a single document.
138 |     
139 |     Args:
140 |         target_filename: Path to the target document (will be created or overwritten)
141 |         source_filenames: List of paths to source documents to merge
142 |         add_page_breaks: If True, add page breaks between documents
143 |     """
144 |     from word_document_server.core.tables import copy_table
145 |     
146 |     target_filename = ensure_docx_extension(target_filename)
147 |     
148 |     # Check if target file is writeable
149 |     is_writeable, error_message = check_file_writeable(target_filename)
150 |     if not is_writeable:
151 |         return f"Cannot create target document: {error_message}"
152 |     
153 |     # Validate all source documents exist
154 |     missing_files = []
155 |     for filename in source_filenames:
156 |         doc_filename = ensure_docx_extension(filename)
157 |         if not os.path.exists(doc_filename):
158 |             missing_files.append(doc_filename)
159 |     
160 |     if missing_files:
161 |         return f"Cannot merge documents. The following source files do not exist: {', '.join(missing_files)}"
162 |     
163 |     try:
164 |         # Create a new document for the merged result
165 |         target_doc = Document()
166 |         
167 |         # Process each source document
168 |         for i, filename in enumerate(source_filenames):
169 |             doc_filename = ensure_docx_extension(filename)
170 |             source_doc = Document(doc_filename)
171 |             
172 |             # Add page break between documents (except before the first one)
173 |             if add_page_breaks and i > 0:
174 |                 target_doc.add_page_break()
175 |             
176 |             # Copy all paragraphs
177 |             for paragraph in source_doc.paragraphs:
178 |                 # Create a new paragraph with the same text and style
179 |                 new_paragraph = target_doc.add_paragraph(paragraph.text)
180 |                 new_paragraph.style = target_doc.styles['Normal']  # Default style
181 |                 
182 |                 # Try to match the style if possible
183 |                 try:
184 |                     if paragraph.style and paragraph.style.name in target_doc.styles:
185 |                         new_paragraph.style = target_doc.styles[paragraph.style.name]
186 |                 except Exception:
187 |                     pass
188 |                 
189 |                 # Copy run formatting (add_paragraph() creates one run, so only the first source run is applied)
190 |                 for run_idx, run in enumerate(paragraph.runs):
191 |                     if run_idx < len(new_paragraph.runs):
192 |                         new_run = new_paragraph.runs[run_idx]
193 |                         # Copy basic formatting
194 |                         new_run.bold = run.bold
195 |                         new_run.italic = run.italic
196 |                         new_run.underline = run.underline
197 |                         # Font size if specified
198 |                         if run.font.size:
199 |                             new_run.font.size = run.font.size
200 |             
201 |             # Copy all tables
202 |             for table in source_doc.tables:
203 |                 copy_table(table, target_doc)
204 |         
205 |         # Save the merged document
206 |         target_doc.save(target_filename)
207 |         return f"Successfully merged {len(source_filenames)} documents into {target_filename}"
208 |     except Exception as e:
209 |         return f"Failed to merge documents: {str(e)}"
210 | 
211 | 
212 | async def get_document_xml_tool(filename: str) -> str:
213 |     """Get the raw XML structure of a Word document."""
214 |     return get_document_xml(filename)
215 | 
```

--------------------------------------------------------------------------------
/word_document_server/tools/extended_document_tools.py:
--------------------------------------------------------------------------------

```python
  1 | """
  2 | Extended document tools for Word Document Server.
  3 | 
  4 | These tools provide enhanced document content extraction and search capabilities.
  5 | """
  6 | import os
  7 | import json
  8 | import subprocess
  9 | import platform
 10 | import shutil
 11 | from typing import Dict, List, Optional, Any, Union, Tuple
 12 | from docx import Document
 13 | 
 14 | from word_document_server.utils.file_utils import check_file_writeable, ensure_docx_extension
 15 | from word_document_server.utils.extended_document_utils import get_paragraph_text, find_text
 16 | 
 17 | 
 18 | async def get_paragraph_text_from_document(filename: str, paragraph_index: int) -> str:
 19 |     """Get text from a specific paragraph in a Word document.
 20 |     
 21 |     Args:
 22 |         filename: Path to the Word document
 23 |         paragraph_index: Index of the paragraph to retrieve (0-based)
 24 |     """
 25 |     filename = ensure_docx_extension(filename)
 26 |     
 27 |     if not os.path.exists(filename):
 28 |         return f"Document {filename} does not exist"
 29 |     
 30 | 
 31 |     if paragraph_index < 0:
 32 |         return "Invalid parameter: paragraph_index must be a non-negative integer"
 33 |     
 34 |     try:
 35 |         result = get_paragraph_text(filename, paragraph_index)
 36 |         return json.dumps(result, indent=2)
 37 |     except Exception as e:
 38 |         return f"Failed to get paragraph text: {str(e)}"
 39 | 
 40 | 
 41 | async def find_text_in_document(filename: str, text_to_find: str, match_case: bool = True, whole_word: bool = False) -> str:
 42 |     """Find occurrences of specific text in a Word document.
 43 |     
 44 |     Args:
 45 |         filename: Path to the Word document
 46 |         text_to_find: Text to search for in the document
 47 |         match_case: Whether to match case (True) or ignore case (False)
 48 |         whole_word: Whether to match whole words only (True) or substrings (False)
 49 |     """
 50 |     filename = ensure_docx_extension(filename)
 51 |     
 52 |     if not os.path.exists(filename):
 53 |         return f"Document {filename} does not exist"
 54 |     
 55 |     if not text_to_find:
 56 |         return "Search text cannot be empty"
 57 |     
 58 |     try:
 59 |         
 60 |         result = find_text(filename, text_to_find, match_case, whole_word)
 61 |         return json.dumps(result, indent=2)
 62 |     except Exception as e:
 63 |         return f"Failed to search for text: {str(e)}"
 64 | 
 65 | 
 66 | async def convert_to_pdf(filename: str, output_filename: Optional[str] = None) -> str:
 67 |     """Convert a Word document to PDF format.
 68 |     
 69 |     Args:
 70 |         filename: Path to the Word document
 71 |         output_filename: Optional path for the output PDF. If not provided, 
 72 |                          will use the same name with .pdf extension
 73 |     """
 74 |     filename = ensure_docx_extension(filename)
 75 |     
 76 |     if not os.path.exists(filename):
 77 |         return f"Document {filename} does not exist"
 78 |     
 79 |     # Generate output filename if not provided
 80 |     if not output_filename:
 81 |         base_name, _ = os.path.splitext(filename)
 82 |         output_filename = f"{base_name}.pdf"
 83 |     elif not output_filename.lower().endswith('.pdf'):
 84 |         output_filename = f"{output_filename}.pdf"
 85 |     
 86 |     # Convert to absolute path if not already
 87 |     if not os.path.isabs(output_filename):
 88 |         output_filename = os.path.abspath(output_filename)
 89 |     
 90 |     # Ensure the output directory exists
 91 |     output_dir = os.path.dirname(output_filename)
 92 |     if not output_dir:
 93 |         output_dir = os.path.abspath('.')
 94 |     
 95 |     # Create the directory if it doesn't exist
 96 |     os.makedirs(output_dir, exist_ok=True)
 97 |     
 98 |     # Check if output file can be written
 99 |     is_writeable, error_message = check_file_writeable(output_filename)
100 |     if not is_writeable:
101 |         return f"Cannot create PDF: {error_message} (Path: {output_filename}, Dir: {output_dir})"
102 |     
103 |     try:
104 |         # Determine platform for appropriate conversion method
105 |         system = platform.system()
106 |         
107 |         if system == "Windows":
108 |             # On Windows, try docx2pdf which uses Microsoft Word
109 |             try:
110 |                 from docx2pdf import convert
111 |                 convert(filename, output_filename)
112 |                 return f"Document successfully converted to PDF: {output_filename}"
113 |             except Exception as e:
114 |                 return f"Failed to convert document to PDF: {str(e)}\nNote: docx2pdf requires Microsoft Word to be installed."
115 |                 
116 |         elif system in ["Linux", "Darwin"]:  # Linux or macOS
117 |             errors = []
118 |             
119 |             # --- Attempt 1: LibreOffice ---
120 |             lo_commands = []
121 |             if system == "Darwin":  # macOS
122 |                 lo_commands = ["soffice", "/Applications/LibreOffice.app/Contents/MacOS/soffice"]
123 |             else:  # Linux
124 |                 lo_commands = ["libreoffice", "soffice"]
125 | 
126 |             for cmd_name in lo_commands:
127 |                 try:
128 |                     output_dir_for_lo = os.path.dirname(output_filename) or '.'
129 |                     os.makedirs(output_dir_for_lo, exist_ok=True)
130 |                     
131 |                     cmd = [cmd_name, '--headless', '--convert-to', 'pdf', '--outdir', output_dir_for_lo, filename]
132 |                     result = subprocess.run(cmd, capture_output=True, text=True, timeout=60, check=False)
133 | 
134 |                     if result.returncode == 0:
135 |                         # LibreOffice typically creates a PDF with the same base name as the source file.
136 |                         # e.g., 'mydoc.docx' -> 'mydoc.pdf'
137 |                         base_name = os.path.splitext(os.path.basename(filename))[0]
138 |                         created_pdf_name = f"{base_name}.pdf"
139 |                         created_pdf_path = os.path.join(output_dir_for_lo, created_pdf_name)
140 | 
141 |                         # If the created file exists, move it to the desired output_filename if necessary.
142 |                         if os.path.exists(created_pdf_path):
143 |                             if created_pdf_path != output_filename:
144 |                                 shutil.move(created_pdf_path, output_filename)
145 |                             
146 |                             # Final check: does the target file now exist?
147 |                             if os.path.exists(output_filename):
148 |                                 return f"Document successfully converted to PDF via {cmd_name}: {output_filename}"
149 |                         
150 |                         # If we get here, soffice returned 0 but the expected file wasn't created.
151 |                         errors.append(f"{cmd_name} returned success code, but output file '{created_pdf_path}' was not found.")
152 |                         # Continue to the next command or fallback.
153 |                     else:
154 |                         errors.append(f"{cmd_name} failed. Stderr: {result.stderr.strip()}")
155 |                 except FileNotFoundError:
156 |                     errors.append(f"Command '{cmd_name}' not found.")
157 |                 except Exception as e:
158 |                     errors.append(f"An error occurred with {cmd_name}: {str(e)}")
159 |             
160 |             # --- Attempt 2: docx2pdf (Fallback) ---
161 |             try:
162 |                 from docx2pdf import convert
163 |                 convert(filename, output_filename)
164 |                 if os.path.exists(output_filename) and os.path.getsize(output_filename) > 0:
165 |                     return f"Document successfully converted to PDF via docx2pdf: {output_filename}"
166 |                 else:
167 |                     errors.append("docx2pdf fallback was executed but failed to create a valid output file.")
168 |             except ImportError:
169 |                 errors.append("docx2pdf is not installed, skipping fallback.")
170 |             except Exception as e:
171 |                 errors.append(f"docx2pdf fallback failed with an exception: {str(e)}")
172 | 
173 |             # --- If all attempts failed ---
174 |             error_summary = "Failed to convert document to PDF using all available methods.\n"
175 |             error_summary += "Recorded errors: " + "; ".join(errors) + "\n"
176 |             error_summary += "To convert documents to PDF, please install either:\n"
177 |             error_summary += "1. LibreOffice (recommended for Linux/macOS)\n"
178 |             error_summary += "2. Microsoft Word (required for docx2pdf on Windows/macOS)"
179 |             return error_summary
180 |         else:
181 |             return f"PDF conversion not supported on {system} platform"
182 |             
183 |     except Exception as e:
184 |         return f"Failed to convert document to PDF: {str(e)}"
185 | 
```
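
On Linux and macOS, `convert_to_pdf` above runs LibreOffice with `--outdir`, which names the PDF after the source file, and then moves that file if a different output name was requested. The sketch below isolates that path arithmetic as a pure function so it can be checked without LibreOffice installed. `plan_libreoffice_conversion` is a hypothetical name, not a function in this repository, and it runs nothing itself.

```python
import os
from typing import List, Tuple


def plan_libreoffice_conversion(cmd_name: str, source: str,
                                output_pdf: str) -> Tuple[List[str], str, bool]:
    """Return (command, path LibreOffice will create, whether a move is needed).

    Hypothetical helper for illustration; it only computes paths.
    """
    out_dir = os.path.dirname(output_pdf) or "."
    command = [cmd_name, "--headless", "--convert-to", "pdf",
               "--outdir", out_dir, source]
    # LibreOffice names the result after the source file, not the requested output.
    base_name = os.path.splitext(os.path.basename(source))[0]
    created_path = os.path.join(out_dir, f"{base_name}.pdf")
    # Compare absolute paths so "./a.pdf" and "a.pdf" count as the same file.
    needs_move = os.path.abspath(created_path) != os.path.abspath(output_pdf)
    return command, created_path, needs_move


if __name__ == "__main__":
    command, created, needs_move = plan_libreoffice_conversion(
        "soffice", "docs/report.docx", "out/final.pdf")
    print(" ".join(command))
    print(created, needs_move)
```

The returned command list could then be passed to `subprocess.run`, as the tool does, with the move performed only when the third value is true.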

--------------------------------------------------------------------------------
/word_document_server/core/protection.py:
--------------------------------------------------------------------------------

```python
  1 | """
  2 | Document protection functionality for Word Document Server.
  3 | """
  4 | import os
  5 | import json
  6 | import hashlib
  7 | import datetime
  8 | from typing import Dict, List, Tuple, Optional, Any
  9 | 
 10 | 
 11 | def add_protection_info(doc_path: str, protection_type: str, password_hash: str, 
 12 |                         sections: Optional[List[str]] = None, 
 13 |                         signature_info: Optional[Dict[str, Any]] = None,
 14 |                         raw_password: Optional[str] = None) -> bool:
 15 |     """
 16 |     Add document protection information to a separate metadata file and encrypt the document.
 17 |     
 18 |     Args:
 19 |         doc_path: Path to the document
 20 |         protection_type: Type of protection ('password', 'restricted', 'signature')
 21 |         password_hash: Hashed password for security
 22 |         sections: List of section names that can be edited (for restricted editing)
 23 |         signature_info: Information about digital signature
 24 |         raw_password: The actual password for document encryption
 25 |         
 26 |     Returns:
 27 |         True if protection info was successfully added, False otherwise
 28 |     """
 29 |     # Create metadata filename based on document path
 30 |     base_path, _ = os.path.splitext(doc_path)
 31 |     metadata_path = f"{base_path}.protection"
 32 |     
 33 |     # Prepare protection data
 34 |     protection_data = {
 35 |         "type": protection_type,
 36 |         "password_hash": password_hash,
 37 |         "applied_date": datetime.datetime.now().isoformat(),
 38 |     }
 39 |     
 40 |     if sections:
 41 |         protection_data["editable_sections"] = sections
 42 |         
 43 |     if signature_info:
 44 |         protection_data["signature"] = signature_info
 45 |     
 46 |     # Write protection info to metadata file
 47 |     try:
 48 |         with open(metadata_path, 'w') as f:
 49 |             json.dump(protection_data, f, indent=2)
 50 |         
 51 |         # Apply actual document encryption if raw_password is provided
 52 |         if protection_type == "password" and raw_password:
 53 |             import msoffcrypto
 54 |             import tempfile
 55 |             import shutil
 56 |             
 57 |             # Create a temporary file for the encrypted output
 58 |             temp_fd, temp_path = tempfile.mkstemp(suffix='.docx')
 59 |             os.close(temp_fd)
 60 |             
 61 |             try:
 62 |                 # Open the document
 63 |                 with open(doc_path, 'rb') as f:
 64 |                     office_file = msoffcrypto.OfficeFile(f)
 65 |                     
 66 |                     # Encrypt with password
 67 |                     office_file.load_key(password=raw_password)
 68 |                     
 69 |                     # Write the encrypted file to the temp path
 70 |                     with open(temp_path, 'wb') as out_file:
 71 |                         office_file.encrypt(raw_password, out_file)
 72 |                 
 73 |                 # Replace original with encrypted version
 74 |                 shutil.move(temp_path, doc_path)
 75 |                 
 76 |                 # Update metadata to note that true encryption was applied
 77 |                 protection_data["true_encryption"] = True
 78 |                 with open(metadata_path, 'w') as f:
 79 |                     json.dump(protection_data, f, indent=2)
 80 |                     
 81 |             except Exception as e:
 82 |                 print(f"Encryption error: {str(e)}")
 83 |                 if os.path.exists(temp_path):
 84 |                     os.unlink(temp_path)
 85 |                 return False
 86 |         
 87 |         return True
 88 |     except Exception as e:
 89 |         print(f"Protection error: {str(e)}")
 90 |         return False
 91 | 
 92 | 
 93 | def verify_document_protection(doc_path: str, password: Optional[str] = None) -> Tuple[bool, str]:
 94 |     """
 95 |     Verify if a document is protected and if the password is correct.
 96 |     
 97 |     Args:
 98 |         doc_path: Path to the document
 99 |         password: Password to verify
100 |     
101 |     Returns:
102 |         Tuple of (is_protected_and_verified, message)
103 |     """
104 |     base_path, _ = os.path.splitext(doc_path)
105 |     metadata_path = f"{base_path}.protection"
106 |     
107 |     # Check if protection metadata exists
108 |     if not os.path.exists(metadata_path):
109 |         return False, "Document is not protected"
110 |     
111 |     try:
112 |         # Read protection data
113 |         with open(metadata_path, 'r') as f:
114 |             protection_data = json.load(f)
115 |         
116 |         # If password is provided, verify it
117 |         if password:
118 |             password_hash = hashlib.sha256(password.encode()).hexdigest()
119 |             if password_hash != protection_data.get("password_hash"):
120 |                 return False, "Incorrect password"
121 |         
122 |         # Return protection type
123 |         protection_type = protection_data.get("type", "unknown")
124 |         return True, f"Document is protected with {protection_type} protection"
125 |         
126 |     except Exception as e:
127 |         return False, f"Error verifying protection: {str(e)}"
128 | 
129 | 
130 | def is_section_editable(doc_path: str, section_name: str) -> bool:
131 |     """
132 |     Check if a specific section of a document is editable.
133 |     
134 |     Args:
135 |         doc_path: Path to the document
136 |         section_name: Name of the section to check
137 |     
138 |     Returns:
139 |         True if section is editable, False otherwise
140 |     """
141 |     base_path, _ = os.path.splitext(doc_path)
142 |     metadata_path = f"{base_path}.protection"
143 |     
144 |     # Check if protection metadata exists
145 |     if not os.path.exists(metadata_path):
146 |         # If no protection exists, all sections are editable
147 |         return True
148 |     
149 |     try:
150 |         # Read protection data
151 |         with open(metadata_path, 'r') as f:
152 |             protection_data = json.load(f)
153 |         
154 |         # Check protection type
155 |         if protection_data.get("type") != "restricted":
156 |             # If not restricted editing, return based on protection type
157 |             return protection_data.get("type") != "password"
158 |         
159 |         # Check if the section is in the list of editable sections
160 |         editable_sections = protection_data.get("editable_sections", [])
161 |         return section_name in editable_sections
162 |         
163 |     except Exception:
164 |         # In case of error, default to not editable for security
165 |         return False
166 | 
167 | 
168 | def create_signature_info(doc, signer_name: str, reason: Optional[str] = None) -> Dict[str, Any]:
169 |     """
170 |     Create signature information for a document.
171 |     
172 |     Args:
173 |         doc: Document object
174 |         signer_name: Name of the person signing the document
175 |         reason: Optional reason for signing
176 |         
177 |     Returns:
178 |         Dictionary containing signature information
179 |     """
180 |     # Create signature info
181 |     signature_info = {
182 |         "signer": signer_name,
183 |         "timestamp": datetime.datetime.now().isoformat(),
184 |     }
185 |     
186 |     if reason:
187 |         signature_info["reason"] = reason
188 |     
189 |     # Generate a simple signature hash based on document content and metadata
190 |     text_content = "\n".join([p.text for p in doc.paragraphs])
191 |     content_hash = hashlib.sha256(text_content.encode()).hexdigest()
192 |     signature_info["content_hash"] = content_hash
193 |     
194 |     return signature_info
195 | 
196 | 
197 | def verify_signature(doc_path: str) -> Tuple[bool, str]:
198 |     """
199 |     Verify a document's digital signature.
200 |     
201 |     Args:
202 |         doc_path: Path to the document
203 |         
204 |     Returns:
205 |         Tuple of (is_valid, message)
206 |     """
207 |     from docx import Document
208 |     
209 |     base_path, _ = os.path.splitext(doc_path)
210 |     metadata_path = f"{base_path}.protection"
211 |     
212 |     if not os.path.exists(metadata_path):
213 |         return False, "Document is not signed"
214 |     
215 |     try:
216 |         # Read protection data
217 |         with open(metadata_path, 'r') as f:
218 |             protection_data = json.load(f)
219 |         
220 |         if protection_data.get("type") != "signature":
221 |             return False, f"Document is protected with {protection_data.get('type')} protection, not a signature"
222 |         
223 |         # Get the original content hash
224 |         signature_info = protection_data.get("signature", {})
225 |         original_hash = signature_info.get("content_hash")
226 |         
227 |         if not original_hash:
228 |             return False, "Invalid signature: missing content hash"
229 |         
230 |         # Calculate current content hash
231 |         doc = Document(doc_path)
232 |         text_content = "\n".join([p.text for p in doc.paragraphs])
233 |         current_hash = hashlib.sha256(text_content.encode()).hexdigest()
234 |         
235 |         # Compare hashes
236 |         if current_hash != original_hash:
237 |             return False, f"Document has been modified since it was signed by {signature_info.get('signer')}"
238 |         
239 |         return True, f"Document signature is valid. Signed by {signature_info.get('signer')} on {signature_info.get('timestamp')}"
240 |     
241 |     except Exception as e:
242 |         return False, f"Error verifying signature: {str(e)}"
243 | 
```
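
The sidecar-metadata scheme used by `core/protection.py` (a `<base>.protection` JSON file next to the document, holding the protection type and a SHA-256 password hash) can be sketched with the standard library alone. The helper names `metadata_path_for`, `write_protection`, and `password_matches` are hypothetical and do not exist in the repository; the constant-time comparison via `hmac.compare_digest` is an assumption of this sketch, not something the module does.

```python
import hashlib
import hmac
import json
import os
import tempfile


def metadata_path_for(doc_path: str) -> str:
    """Return the sidecar path: same base name, '.protection' extension."""
    base_path, _ = os.path.splitext(doc_path)
    return f"{base_path}.protection"


def write_protection(doc_path: str, protection_type: str, password: str) -> str:
    """Write a sidecar holding the protection type and a SHA-256 password hash."""
    data = {
        "type": protection_type,
        "password_hash": hashlib.sha256(password.encode()).hexdigest(),
    }
    path = metadata_path_for(doc_path)
    with open(path, "w") as f:
        json.dump(data, f, indent=2)
    return path


def password_matches(doc_path: str, password: str) -> bool:
    """Compare a candidate password against the hash stored in the sidecar."""
    path = metadata_path_for(doc_path)
    if not os.path.exists(path):
        return False
    with open(path, "r") as f:
        stored = json.load(f).get("password_hash", "")
    candidate = hashlib.sha256(password.encode()).hexdigest()
    # Constant-time comparison: a hardening choice of this sketch only.
    return hmac.compare_digest(candidate.encode(), stored.encode())


# Round trip in a throwaway directory; no real .docx file is needed
# because only the sidecar is read or written.
with tempfile.TemporaryDirectory() as tmp:
    doc = os.path.join(tmp, "report.docx")
    write_protection(doc, "restricted", "s3cret")
    ok = password_matches(doc, "s3cret")
    bad = password_matches(doc, "wrong")
    sidecar_name = os.path.basename(metadata_path_for(doc))
```

Because the sidecar lives outside the `.docx`, this kind of protection is advisory: it only binds tools that choose to read the `.protection` file.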

--------------------------------------------------------------------------------
/word_document_server/tools/protection_tools.py:
--------------------------------------------------------------------------------

```python
  1 | """
  2 | Protection tools for Word Document Server.
  3 | 
  4 | These tools handle document protection features such as
  5 | password protection, restricted editing, and digital signatures.
  6 | """
  7 | import os
  8 | import hashlib
  9 | import datetime
 10 | import io 
 11 | from typing import List, Optional, Dict, Any
 12 | from docx import Document
 13 | import msoffcrypto 
 14 | 
 15 | from word_document_server.utils.file_utils import check_file_writeable, ensure_docx_extension
 16 | 
 17 | 
 18 | 
 19 | from word_document_server.core.protection import (
 20 |     add_protection_info,
 21 |     verify_document_protection,
 22 |     create_signature_info
 23 | )
 24 | 
 25 | 
 26 | async def protect_document(filename: str, password: str) -> str:
 27 |     """Add password protection to a Word document.
 28 | 
 29 |     Args:
 30 |         filename: Path to the Word document
 31 |         password: Password to protect the document with
 32 |     """
 33 |     filename = ensure_docx_extension(filename)
 34 | 
 35 |     if not os.path.exists(filename):
 36 |         return f"Document {filename} does not exist"
 37 | 
 38 |     # Check if file is writeable
 39 |     is_writeable, error_message = check_file_writeable(filename)
 40 |     if not is_writeable:
 41 |         return f"Cannot protect document: {error_message}"
 42 | 
 43 |     try:
 44 |         # Read the original file content
 45 |         with open(filename, "rb") as infile:
 46 |             original_data = infile.read()
 47 | 
 48 |         # Create an msoffcrypto file object from the original data
 49 |         file = msoffcrypto.OfficeFile(io.BytesIO(original_data))
 50 |         file.load_key(password=password) # Set the password for encryption
 51 | 
 52 |         # Encrypt the data into an in-memory buffer
 53 |         encrypted_data_io = io.BytesIO()
 54 |         
 55 |         file.encrypt(password, encrypted_data_io)
 56 | 
 57 |         # Overwrite the original file with the encrypted data
 58 |         with open(filename, "wb") as outfile:
 59 |             outfile.write(encrypted_data_io.getvalue())
 60 | 
 61 |         
 62 |         base_path, _ = os.path.splitext(filename)
 63 |         metadata_path = f"{base_path}.protection"
 64 |         if os.path.exists(metadata_path):
 65 |             os.remove(metadata_path)
 66 | 
 67 |         return f"Document {filename} encrypted successfully with password."
 68 | 
 69 |     except Exception as e:
 70 |         # Attempt to restore original file content on failure
 71 |         try:
 72 |             if 'original_data' in locals():
 73 |                 with open(filename, "wb") as outfile:
 74 |                     outfile.write(original_data)
 75 |                 return f"Failed to encrypt document {filename}: {str(e)}. Original file restored."
 76 |             else:
 77 |                  return f"Failed to encrypt document {filename}: {str(e)}. Could not restore original file."
 78 |         except Exception as restore_e:
 79 |              return f"Failed to encrypt document {filename}: {str(e)}. Also failed to restore original file: {str(restore_e)}"
 80 | 
 81 | 
 82 | async def add_restricted_editing(filename: str, password: str, editable_sections: List[str]) -> str:
 83 |     """Add restricted editing to a Word document, allowing editing only in specified sections.
 84 | 
 85 |     Args:
 86 |         filename: Path to the Word document
 87 |         password: Password to protect the document with
 88 |         editable_sections: List of section names that can be edited
 89 |     """
 90 |     filename = ensure_docx_extension(filename)
 91 | 
 92 |     if not os.path.exists(filename):
 93 |         return f"Document {filename} does not exist"
 94 | 
 95 |     # Check if file is writeable
 96 |     is_writeable, error_message = check_file_writeable(filename)
 97 |     if not is_writeable:
 98 |         return f"Cannot protect document: {error_message}"
 99 | 
100 |     try:
101 |         # Hash the password for security
102 |         password_hash = hashlib.sha256(password.encode()).hexdigest()
103 | 
104 |         # Add protection info to metadata
105 |         success = add_protection_info(
106 |             filename,
107 |             protection_type="restricted",
108 |             password_hash=password_hash,
109 |             sections=editable_sections
110 |         )
111 | 
112 |         if not success:
113 |             return f"Failed to protect document {filename} with restricted editing"
114 | 
115 |         if not editable_sections:
116 |             return "No editable sections specified. Document will be fully protected."
117 | 
118 |         return f"Document {filename} protected with restricted editing. Editable sections: {', '.join(editable_sections)}"
119 |     except Exception as e:
120 |         return f"Failed to add restricted editing: {str(e)}"
121 | 
122 | async def add_digital_signature(filename: str, signer_name: str, reason: Optional[str] = None) -> str:
123 |     """Add a digital signature to a Word document.
124 | 
125 |     Args:
126 |         filename: Path to the Word document
127 |         signer_name: Name of the person signing the document
128 |         reason: Optional reason for signing
129 |     """
130 |     filename = ensure_docx_extension(filename)
131 | 
132 |     if not os.path.exists(filename):
133 |         return f"Document {filename} does not exist"
134 | 
135 |     # Check if file is writeable
136 |     is_writeable, error_message = check_file_writeable(filename)
137 |     if not is_writeable:
138 |         return f"Cannot add signature to document: {error_message}"
139 | 
140 |     try:
141 |         doc = Document(filename)
142 |         # Create signature info; its hash of the pre-signature text supplies the Signature ID
143 |         signature_info = create_signature_info(doc, signer_name, reason)
144 |         # Add a visible signature block to the document
145 |         doc.add_paragraph("").add_run()  # Add empty paragraph for spacing
146 |         signature_para = doc.add_paragraph()
147 |         signature_para.add_run(f"Digitally signed by: {signer_name}").bold = True
148 |         if reason:
149 |             signature_para.add_run(f"\nReason: {reason}")
150 |         signature_para.add_run(f"\nDate: {datetime.datetime.now().strftime('%Y-%m-%d %H:%M:%S')}")
151 |         signature_para.add_run(f"\nSignature ID: {signature_info['content_hash'][:8]}")
152 |         # Re-hash the text with the signature block included so that verifying the saved file matches
153 |         text_content = "\n".join([p.text for p in doc.paragraphs])
154 |         signature_info["content_hash"] = hashlib.sha256(text_content.encode()).hexdigest()
155 |         # Add protection info to metadata
156 |         success = add_protection_info(
157 |             filename,
158 |             protection_type="signature",
159 |             password_hash="",  # No password for signature-only
160 |             signature_info=signature_info
161 |         )
162 | 
163 |         if success:
164 |             # Save the document with the visible signature
165 |             doc.save(filename)
166 | 
167 |             return f"Digital signature added to document {filename}"
168 |         else:
169 |             return f"Failed to add digital signature to document {filename}"
170 |     except Exception as e:
171 |         return f"Failed to add digital signature: {str(e)}"
172 | 
173 | async def verify_document(filename: str, password: Optional[str] = None) -> str:
174 |     """Verify document protection and/or digital signature.
175 | 
176 |     Args:
177 |         filename: Path to the Word document
178 |         password: Optional password to verify
179 |     """
180 |     filename = ensure_docx_extension(filename)
181 | 
182 |     if not os.path.exists(filename):
183 |         return f"Document {filename} does not exist"
184 | 
185 |     try:
186 |         # Verify document protection
187 |         is_verified, message = verify_document_protection(filename, password)
188 | 
189 |         if not is_verified and password:
190 |             return f"Document verification failed: {message}"
191 | 
192 |         # If document has a digital signature, verify content integrity
193 |         base_path, _ = os.path.splitext(filename)
194 |         metadata_path = f"{base_path}.protection"
195 | 
196 |         if os.path.exists(metadata_path):
197 |             try:
198 |                 import json
199 |                 with open(metadata_path, 'r') as f:
200 |                     protection_data = json.load(f)
201 | 
202 |                 if protection_data.get("type") == "signature":
203 |                     # Get the original content hash
204 |                     signature_info = protection_data.get("signature", {})
205 |                     original_hash = signature_info.get("content_hash")
206 | 
207 |                     if original_hash:
208 |                         # Calculate current content hash
209 |                         doc = Document(filename)
210 |                         text_content = "\n".join([p.text for p in doc.paragraphs])
211 |                         current_hash = hashlib.sha256(text_content.encode()).hexdigest()
212 | 
213 |                         # Compare hashes
214 |                         if current_hash != original_hash:
215 |                             return f"Document has been modified since it was signed by {signature_info.get('signer')}"
216 |                         else:
217 |                             return f"Document signature is valid. Signed by {signature_info.get('signer')} on {signature_info.get('timestamp')}"
218 |             except Exception as e:
219 |                 return f"Error verifying signature: {str(e)}"
220 | 
221 |         return message
222 |     except Exception as e:
223 |         return f"Failed to verify document: {str(e)}"
224 | 
225 | async def unprotect_document(filename: str, password: str) -> str:
226 |     """Remove password protection from a Word document.
227 | 
228 |     Args:
229 |         filename: Path to the Word document
230 |         password: Password that was used to protect the document
231 |     """
232 |     filename = ensure_docx_extension(filename)
233 | 
234 |     if not os.path.exists(filename):
235 |         return f"Document {filename} does not exist"
236 | 
237 |     # Check if file is writeable
238 |     is_writeable, error_message = check_file_writeable(filename)
239 |     if not is_writeable:
240 |         return f"Cannot modify document: {error_message}"
241 | 
242 |     try:
243 |         # Read the encrypted file content
244 |         with open(filename, "rb") as infile:
245 |             encrypted_data = infile.read()
246 | 
247 |         # Create an msoffcrypto file object from the encrypted data
248 |         file = msoffcrypto.OfficeFile(io.BytesIO(encrypted_data))
249 |         file.load_key(password=password) # Set the password for decryption
250 | 
251 |         # Decrypt the data into an in-memory buffer
252 |         decrypted_data_io = io.BytesIO()
253 |         file.decrypt(decrypted_data_io) # Write the decrypted bytes into the buffer
254 | 
255 |         # Overwrite the original file with the decrypted data
256 |         with open(filename, "wb") as outfile:
257 |             outfile.write(decrypted_data_io.getvalue())
258 | 
259 |         return f"Document {filename} decrypted successfully."
260 | 
261 |     except msoffcrypto.exceptions.InvalidKeyError:
262 |          return f"Failed to decrypt document {filename}: Incorrect password."
263 |     except msoffcrypto.exceptions.FileFormatError:
264 |          return f"Failed to decrypt document {filename}: File is not encrypted or is not a supported Office format."
265 |     except Exception as e:
266 |         # Attempt to restore encrypted file content on failure
267 |         try:
268 |             if 'encrypted_data' in locals():
269 |                 with open(filename, "wb") as outfile:
270 |                     outfile.write(encrypted_data)
271 |                 return f"Failed to decrypt document {filename}: {str(e)}. Encrypted file restored."
272 |             else:
273 |                  return f"Failed to decrypt document {filename}: {str(e)}. Could not restore encrypted file."
274 |         except Exception as restore_e:
275 |              return f"Failed to decrypt document {filename}: {str(e)}. Also failed to restore encrypted file: {str(restore_e)}"
276 | 
```
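
The signature tools above detect tampering by hashing the paragraph text joined with newlines and comparing it to a recorded hash. A minimal sketch of that check, with plain strings standing in for `doc.paragraphs` (the names `content_hash` and `is_unmodified` are hypothetical, not part of the repository):

```python
import hashlib
from typing import List


def content_hash(paragraphs: List[str]) -> str:
    """SHA-256 over the paragraph texts joined by newlines."""
    return hashlib.sha256("\n".join(paragraphs).encode()).hexdigest()


def is_unmodified(paragraphs: List[str], recorded_hash: str) -> bool:
    """True when the current text still hashes to the recorded value."""
    return content_hash(paragraphs) == recorded_hash


body = ["Quarterly report", "Revenue grew."]
signed_hash = content_hash(body)

# Same text: the check passes.
unchanged = is_unmodified(body, signed_hash)

# Any edit to a paragraph changes the hash.
edited = is_unmodified(["Quarterly report", "Revenue fell."], signed_hash)

# Appending paragraphs (for example a visible signature block) after the
# hash was recorded also changes it.
with_block = is_unmodified(body + ["", "Digitally signed by: Alice"], signed_hash)
```

The last case is the one to watch: the recorded hash has to cover exactly the text that verification will later read, so any content appended at signing time must be included before hashing. Note also that this is a content checksum rather than a cryptographic signature, since no private key is involved.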

--------------------------------------------------------------------------------
/word_document_server/tools/content_tools.py:
--------------------------------------------------------------------------------

```python
  1 | """
  2 | Content tools for Word Document Server.
  3 | 
  4 | These tools add various types of content to Word documents,
  5 | including headings, paragraphs, tables, images, and page breaks.
  6 | """
  7 | import os
  8 | from typing import List, Optional, Dict, Any
  9 | from docx import Document
 10 | from docx.shared import Inches, Pt, RGBColor
 11 | 
 12 | from word_document_server.utils.file_utils import check_file_writeable, ensure_docx_extension
 13 | from word_document_server.utils.document_utils import find_and_replace_text, insert_header_near_text, insert_numbered_list_near_text, insert_line_or_paragraph_near_text, replace_paragraph_block_below_header, replace_block_between_manual_anchors
 14 | from word_document_server.core.styles import ensure_heading_style, ensure_table_style
 15 | 
 16 | 
 17 | async def add_heading(filename: str, text: str, level: int = 1,
 18 |                       font_name: Optional[str] = None, font_size: Optional[int] = None,
 19 |                       bold: Optional[bool] = None, italic: Optional[bool] = None,
 20 |                       border_bottom: bool = False) -> str:
 21 |     """Add a heading to a Word document with optional formatting.
 22 | 
 23 |     Args:
 24 |         filename: Path to the Word document
 25 |         text: Heading text
 26 |         level: Heading level (1-9, where 1 is the highest level)
 27 |         font_name: Font family (e.g., 'Helvetica')
 28 |         font_size: Font size in points (e.g., 14)
 29 |         bold: True/False for bold text
 30 |         italic: True/False for italic text
 31 |         border_bottom: True to add bottom border (for section headers)
 32 |     """
 33 |     filename = ensure_docx_extension(filename)
 34 | 
 35 |     # Ensure level is converted to integer
 36 |     try:
 37 |         level = int(level)
 38 |     except (ValueError, TypeError):
 39 |         return "Invalid parameter: level must be an integer between 1 and 9"
 40 | 
 41 |     # Validate level range
 42 |     if level < 1 or level > 9:
 43 |         return f"Invalid heading level: {level}. Level must be between 1 and 9."
 44 | 
 45 |     if not os.path.exists(filename):
 46 |         return f"Document {filename} does not exist"
 47 | 
 48 |     # Check if file is writeable
 49 |     is_writeable, error_message = check_file_writeable(filename)
 50 |     if not is_writeable:
 51 |         # Suggest creating a copy
 52 |         return f"Cannot modify document: {error_message}. Consider creating a copy first or creating a new document."
 53 | 
 54 |     try:
 55 |         doc = Document(filename)
 56 | 
 57 |         # Ensure heading styles exist
 58 |         ensure_heading_style(doc)
 59 | 
 60 |         # Try to add heading with style
 61 |         try:
 62 |             heading = doc.add_heading(text, level=level)
 63 |         except Exception as style_error:
 64 |             # If style-based approach fails, use direct formatting
 65 |             heading = doc.add_paragraph(text)
 66 |             heading.style = doc.styles['Normal']
 67 |             if heading.runs:
 68 |                 run = heading.runs[0]
 69 |                 run.bold = True
 70 |                 # Adjust size based on heading level
 71 |                 if level == 1:
 72 |                     run.font.size = Pt(16)
 73 |                 elif level == 2:
 74 |                     run.font.size = Pt(14)
 75 |                 else:
 76 |                     run.font.size = Pt(12)
 77 | 
 78 |         # Apply formatting to all runs in the heading
 79 |         if any([font_name, font_size, bold is not None, italic is not None]):
 80 |             for run in heading.runs:
 81 |                 if font_name:
 82 |                     run.font.name = font_name
 83 |                 if font_size:
 84 |                     run.font.size = Pt(font_size)
 85 |                 if bold is not None:
 86 |                     run.font.bold = bold
 87 |                 if italic is not None:
 88 |                     run.font.italic = italic
 89 | 
 90 |         # Add bottom border if requested
 91 |         if border_bottom:
 92 |             from docx.oxml import OxmlElement
 93 |             from docx.oxml.ns import qn
 94 | 
 95 |             pPr = heading._element.get_or_add_pPr()
 96 |             pBdr = OxmlElement('w:pBdr')
 97 | 
 98 |             bottom = OxmlElement('w:bottom')
 99 |             bottom.set(qn('w:val'), 'single')
100 |             bottom.set(qn('w:sz'), '4')  # 0.5pt border
101 |             bottom.set(qn('w:space'), '0')
102 |             bottom.set(qn('w:color'), '000000')
103 | 
104 |             pBdr.append(bottom)
105 |             pPr.append(pBdr)
106 | 
107 |         doc.save(filename)
108 |         return f"Heading '{text}' (level {level}) added to {filename}"
109 |     except Exception as e:
110 |         return f"Failed to add heading: {str(e)}"
111 | 
112 | 
113 | async def add_paragraph(filename: str, text: str, style: Optional[str] = None,
114 |                         font_name: Optional[str] = None, font_size: Optional[int] = None,
115 |                         bold: Optional[bool] = None, italic: Optional[bool] = None,
116 |                         color: Optional[str] = None) -> str:
117 |     """Add a paragraph to a Word document with optional formatting.
118 | 
119 |     Args:
120 |         filename: Path to the Word document
121 |         text: Paragraph text
122 |         style: Optional paragraph style name
123 |         font_name: Font family (e.g., 'Helvetica', 'Times New Roman')
124 |         font_size: Font size in points (e.g., 14, 36)
125 |         bold: True/False for bold text
126 |         italic: True/False for italic text
127 |         color: RGB color as hex string (e.g., '000000' for black)
128 |     """
129 |     filename = ensure_docx_extension(filename)
130 | 
131 |     if not os.path.exists(filename):
132 |         return f"Document {filename} does not exist"
133 | 
134 |     # Check if file is writeable
135 |     is_writeable, error_message = check_file_writeable(filename)
136 |     if not is_writeable:
137 |         # Suggest creating a copy
138 |         return f"Cannot modify document: {error_message}. Consider creating a copy first or creating a new document."
139 | 
140 |     try:
141 |         doc = Document(filename)
142 |         paragraph = doc.add_paragraph(text)
143 | 
144 |         if style:
145 |             try:
146 |                 paragraph.style = style
147 |             except KeyError:
148 |                 # Style doesn't exist, use normal and report it
149 |                 paragraph.style = doc.styles['Normal']
150 |                 doc.save(filename)
151 |                 return f"Style '{style}' not found, paragraph added with default style to {filename}"
152 | 
153 |         # Apply formatting to all runs in the paragraph
154 |         if any([font_name, font_size, bold is not None, italic is not None, color]):
155 |             for run in paragraph.runs:
156 |                 if font_name:
157 |                     run.font.name = font_name
158 |                 if font_size:
159 |                     run.font.size = Pt(font_size)
160 |                 if bold is not None:
161 |                     run.font.bold = bold
162 |                 if italic is not None:
163 |                     run.font.italic = italic
164 |                 if color:
165 |                     # Remove any '#' prefix if present
166 |                     color_hex = color.lstrip('#')
167 |                     run.font.color.rgb = RGBColor.from_string(color_hex)
168 | 
169 |         doc.save(filename)
170 |         return f"Paragraph added to {filename}"
171 |     except Exception as e:
172 |         return f"Failed to add paragraph: {str(e)}"
173 | 
174 | 
175 | async def add_table(filename: str, rows: int, cols: int, data: Optional[List[List[str]]] = None) -> str:
176 |     """Add a table to a Word document.
177 |     
178 |     Args:
179 |         filename: Path to the Word document
180 |         rows: Number of rows in the table
181 |         cols: Number of columns in the table
182 |         data: Optional 2D array of data to fill the table
183 |     """
184 |     filename = ensure_docx_extension(filename)
185 |     
186 |     if not os.path.exists(filename):
187 |         return f"Document {filename} does not exist"
188 |     
189 |     # Check if file is writeable
190 |     is_writeable, error_message = check_file_writeable(filename)
191 |     if not is_writeable:
192 |         # Suggest creating a copy
193 |         return f"Cannot modify document: {error_message}. Consider creating a copy first or creating a new document."
194 |     
195 |     try:
196 |         doc = Document(filename)
197 |         table = doc.add_table(rows=rows, cols=cols)
198 |         
199 |         # Try to set the table style
200 |         try:
201 |             table.style = 'Table Grid'
202 |         except KeyError:
203 |             # If style doesn't exist, add basic borders
204 |             pass
205 |         
206 |         # Fill table with data if provided
207 |         if data:
208 |             for i, row_data in enumerate(data):
209 |                 if i >= rows:
210 |                     break
211 |                 for j, cell_text in enumerate(row_data):
212 |                     if j >= cols:
213 |                         break
214 |                     table.cell(i, j).text = str(cell_text)
215 |         
216 |         doc.save(filename)
217 |         return f"Table ({rows}x{cols}) added to {filename}"
218 |     except Exception as e:
219 |         return f"Failed to add table: {str(e)}"
220 | 
221 | 
222 | async def add_picture(filename: str, image_path: str, width: Optional[float] = None) -> str:
223 |     """Add an image to a Word document.
224 |     
225 |     Args:
226 |         filename: Path to the Word document
227 |         image_path: Path to the image file
228 |         width: Optional width in inches (proportional scaling)
229 |     """
230 |     filename = ensure_docx_extension(filename)
231 |     
232 |     # Validate document existence
233 |     if not os.path.exists(filename):
234 |         return f"Document {filename} does not exist"
235 |     
236 |     # Get absolute paths for better diagnostics
237 |     abs_filename = os.path.abspath(filename)
238 |     abs_image_path = os.path.abspath(image_path)
239 |     
240 |     # Validate image existence with improved error message
241 |     if not os.path.exists(abs_image_path):
242 |         return f"Image file not found: {abs_image_path}"
243 |     
244 |     # Check image file size
245 |     try:
246 |         image_size = os.path.getsize(abs_image_path) / 1024  # Size in KB
247 |         if image_size <= 0:
248 |             return f"Image file appears to be empty: {abs_image_path} (0 KB)"
249 |     except Exception as size_error:
250 |         return f"Error checking image file: {str(size_error)}"
251 |     
252 |     # Check if file is writeable
253 |     is_writeable, error_message = check_file_writeable(abs_filename)
254 |     if not is_writeable:
255 |         return f"Cannot modify document: {error_message}. Consider creating a copy first or creating a new document."
256 |     
257 |     try:
258 |         doc = Document(abs_filename)
259 |         # Additional diagnostic info
260 |         diagnostic = f"Attempting to add image ({abs_image_path}, {image_size:.2f} KB) to document ({abs_filename})"
261 |         
262 |         try:
263 |             if width:
264 |                 doc.add_picture(abs_image_path, width=Inches(width))
265 |             else:
266 |                 doc.add_picture(abs_image_path)
267 |             doc.save(abs_filename)
268 |             return f"Picture {image_path} added to {filename}"
269 |         except Exception as inner_error:
270 |             # More detailed error for the specific operation
271 |             error_type = type(inner_error).__name__
272 |             error_msg = str(inner_error)
273 |             return f"Failed to add picture: {error_type} - {error_msg or 'No error details available'}\nDiagnostic info: {diagnostic}"
274 |     except Exception as outer_error:
275 |         # Fallback error handling
276 |         error_type = type(outer_error).__name__
277 |         error_msg = str(outer_error)
278 |         return f"Document processing error: {error_type} - {error_msg or 'No error details available'}"
279 | 
280 | 
281 | async def add_page_break(filename: str) -> str:
282 |     """Add a page break to the document.
283 |     
284 |     Args:
285 |         filename: Path to the Word document
286 |     """
287 |     filename = ensure_docx_extension(filename)
288 |     
289 |     if not os.path.exists(filename):
290 |         return f"Document {filename} does not exist"
291 |     
292 |     # Check if file is writeable
293 |     is_writeable, error_message = check_file_writeable(filename)
294 |     if not is_writeable:
295 |         return f"Cannot modify document: {error_message}. Consider creating a copy first."
296 |     
297 |     try:
298 |         doc = Document(filename)
299 |         doc.add_page_break()
300 |         doc.save(filename)
301 |         return f"Page break added to {filename}."
302 |     except Exception as e:
303 |         return f"Failed to add page break: {str(e)}"
304 | 
305 | 
306 | async def add_table_of_contents(filename: str, title: str = "Table of Contents", max_level: int = 3) -> str:
307 |     """Add a table of contents to a Word document based on heading styles.
308 |     
309 |     Args:
310 |         filename: Path to the Word document
311 |         title: Optional title for the table of contents
312 |         max_level: Maximum heading level to include (1-9)
313 |     """
314 |     filename = ensure_docx_extension(filename)
315 |     
316 |     if not os.path.exists(filename):
317 |         return f"Document {filename} does not exist"
318 |     
319 |     # Check if file is writeable
320 |     is_writeable, error_message = check_file_writeable(filename)
321 |     if not is_writeable:
322 |         return f"Cannot modify document: {error_message}. Consider creating a copy first."
323 |     
324 |     try:
325 |         # Ensure max_level is within valid range
326 |         max_level = max(1, min(max_level, 9))
327 |         
328 |         doc = Document(filename)
329 |         
330 |         # Collect headings and their positions
331 |         headings = []
332 |         for i, paragraph in enumerate(doc.paragraphs):
333 |             # Check if paragraph style is a heading
334 |             if paragraph.style and paragraph.style.name.startswith('Heading '):
335 |                 try:
336 |                     # Extract heading level from style name
337 |                     level = int(paragraph.style.name.split(' ')[1])
338 |                     if level <= max_level:
339 |                         headings.append({
340 |                             'level': level,
341 |                             'text': paragraph.text,
342 |                             'position': i
343 |                         })
344 |                 except (ValueError, IndexError):
345 |                     # Skip if heading level can't be determined
346 |                     pass
347 |         
348 |         if not headings:
349 |             return f"No headings found in document {filename}. Table of contents not created."
350 |         
351 |         # Create a new document with the TOC
352 |         toc_doc = Document()
353 |         
354 |         # Add title
355 |         if title:
356 |             toc_doc.add_heading(title, level=1)
357 |         
358 |         # Add TOC entries
359 |         for heading in headings:
360 |             # Indent based on level (four spaces per level)
361 |             indent = '    ' * (heading['level'] - 1)
362 |             toc_doc.add_paragraph(f"{indent}{heading['text']}")
363 |         
364 |         # Add page break
365 |         toc_doc.add_page_break()
366 |         
367 |         # Get content from original document
368 |         for paragraph in doc.paragraphs:
369 |             p = toc_doc.add_paragraph(paragraph.text)
370 |             # Copy style if possible
371 |             try:
372 |                 if paragraph.style:
373 |                     p.style = paragraph.style.name
374 |             except Exception:
375 |                 pass
376 |         
377 |         # Copy tables
378 |         for table in doc.tables:
379 |             # Create a new table with the same dimensions
380 |             new_table = toc_doc.add_table(rows=len(table.rows), cols=len(table.columns))
381 |             # Copy cell contents
382 |             for i, row in enumerate(table.rows):
383 |                 for j, cell in enumerate(row.cells):
384 |                     # cell.text joins every paragraph in the cell, so multi-paragraph cells are not truncated
385 |                     new_table.cell(i, j).text = cell.text
386 |         
387 |         # Save the new document with TOC
388 |         toc_doc.save(filename)
389 |         
390 |         return f"Table of contents with {len(headings)} entries added to {filename}"
391 |     except Exception as e:
392 |         return f"Failed to add table of contents: {str(e)}"
393 | 
394 | 
395 | async def delete_paragraph(filename: str, paragraph_index: int) -> str:
396 |     """Delete a paragraph from a document.
397 |     
398 |     Args:
399 |         filename: Path to the Word document
400 |         paragraph_index: Index of the paragraph to delete (0-based)
401 |     """
402 |     filename = ensure_docx_extension(filename)
403 |     
404 |     if not os.path.exists(filename):
405 |         return f"Document {filename} does not exist"
406 |     
407 |     # Check if file is writeable
408 |     is_writeable, error_message = check_file_writeable(filename)
409 |     if not is_writeable:
410 |         return f"Cannot modify document: {error_message}. Consider creating a copy first."
411 |     
412 |     try:
413 |         doc = Document(filename)
414 |         
415 |         # Validate paragraph index
416 |         if paragraph_index < 0 or paragraph_index >= len(doc.paragraphs):
417 |             return f"Invalid paragraph index. Document has {len(doc.paragraphs)} paragraphs (0-{len(doc.paragraphs)-1})."
418 |         
419 |         # Delete the paragraph by detaching its underlying XML element from its parent
420 |         # Note: python-docx has no public paragraph-deletion API, so this relies on the private _p element
421 |         paragraph = doc.paragraphs[paragraph_index]
422 |         p = paragraph._p
423 |         p.getparent().remove(p)
424 |         
425 |         doc.save(filename)
426 |         return f"Paragraph at index {paragraph_index} deleted successfully."
427 |     except Exception as e:
428 |         return f"Failed to delete paragraph: {str(e)}"
429 | 
430 | 
431 | async def search_and_replace(filename: str, find_text: str, replace_text: str) -> str:
432 |     """Search for text and replace all occurrences.
433 |     
434 |     Args:
435 |         filename: Path to the Word document
436 |         find_text: Text to search for
437 |         replace_text: Text to replace with
438 |     """
439 |     filename = ensure_docx_extension(filename)
440 |     
441 |     if not os.path.exists(filename):
442 |         return f"Document {filename} does not exist"
443 |     
444 |     # Check if file is writeable
445 |     is_writeable, error_message = check_file_writeable(filename)
446 |     if not is_writeable:
447 |         return f"Cannot modify document: {error_message}. Consider creating a copy first."
448 |     
449 |     try:
450 |         doc = Document(filename)
451 |         
452 |         # Perform find and replace
453 |         count = find_and_replace_text(doc, find_text, replace_text)
454 |         
455 |         if count > 0:
456 |             doc.save(filename)
457 |             return f"Replaced {count} occurrence(s) of '{find_text}' with '{replace_text}'."
458 |         else:
459 |             return f"No occurrences of '{find_text}' found."
460 |     except Exception as e:
461 |         return f"Failed to search and replace: {str(e)}"
462 | 
463 | async def insert_header_near_text_tool(filename: str, target_text: str = None, header_title: str = "", position: str = 'after', header_style: str = 'Heading 1', target_paragraph_index: int = None) -> str:
464 |     """Insert a header (with specified style) before or after the target paragraph. Specify by text or paragraph index."""
465 |     return insert_header_near_text(filename, target_text, header_title, position, header_style, target_paragraph_index)
466 | 
467 | async def insert_numbered_list_near_text_tool(filename: str, target_text: str = None, list_items: list = None, position: str = 'after', target_paragraph_index: int = None, bullet_type: str = 'bullet') -> str:
468 |     """Insert a bulleted or numbered list before or after the target paragraph. Specify by text or paragraph index."""
469 |     return insert_numbered_list_near_text(filename, target_text, list_items, position, target_paragraph_index, bullet_type)
470 | 
471 | async def insert_line_or_paragraph_near_text_tool(filename: str, target_text: str = None, line_text: str = "", position: str = 'after', line_style: str = None, target_paragraph_index: int = None) -> str:
472 |     """Insert a new line or paragraph (with specified or matched style) before or after the target paragraph. Specify by text or paragraph index."""
473 |     return insert_line_or_paragraph_near_text(filename, target_text, line_text, position, line_style, target_paragraph_index)
474 | 
475 | async def replace_paragraph_block_below_header_tool(filename: str, header_text: str, new_paragraphs: list, detect_block_end_fn=None) -> str:
476 |     """Replace the block of paragraphs below a header, avoiding changes to the TOC."""
477 |     return replace_paragraph_block_below_header(filename, header_text, new_paragraphs, detect_block_end_fn)
478 | 
479 | async def replace_block_between_manual_anchors_tool(filename: str, start_anchor_text: str, new_paragraphs: list, end_anchor_text: str = None, match_fn=None, new_paragraph_style: str = None) -> str:
480 |     """Replace all content between start_anchor_text and end_anchor_text (or next logical header if not provided)."""
481 |     return replace_block_between_manual_anchors(filename, start_anchor_text, new_paragraphs, end_anchor_text, match_fn, new_paragraph_style)
482 | 
```
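
The heading-collection and indentation steps of `add_table_of_contents` can be sketched in isolation, without python-docx. This is a minimal sketch, assuming paragraphs arrive as `(style_name, text)` pairs; `heading_level` and `build_toc_lines` are hypothetical helper names that do not exist in the repository.

```python
from typing import List, Optional, Tuple


def heading_level(style_name: str) -> Optional[int]:
    """Return N for a style named 'Heading N', or None if it is not a heading."""
    if not style_name.startswith("Heading "):
        return None
    try:
        # Same parsing rule as the tool: take the token after the first space
        return int(style_name.split(" ")[1])
    except (ValueError, IndexError):
        return None


def build_toc_lines(paragraphs: List[Tuple[str, str]], max_level: int = 3) -> List[str]:
    """Build indented TOC lines from (style_name, text) pairs (hypothetical helper)."""
    # Clamp max_level to the 1-9 range, as the tool does
    max_level = max(1, min(max_level, 9))
    lines = []
    for style_name, text in paragraphs:
        level = heading_level(style_name)
        if level is not None and level <= max_level:
            # Four spaces of indentation per level below the top
            lines.append("    " * (level - 1) + text)
    return lines
```

Separating the parsing from the document I/O makes the level filter and indentation easy to check on their own before any file is rewritten.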

--------------------------------------------------------------------------------
/setup_mcp.py:
--------------------------------------------------------------------------------

```python
  1 | # Import necessary Python standard libraries
  2 | import os          
  3 | import json        
  4 | import subprocess  
  5 | import sys         
  6 | import shutil     
  7 | import platform
  8 | 
  9 | def check_prerequisites():
 10 |     """
 11 |     Check if necessary prerequisites are installed
 12 |     
 13 |     Returns:
 14 |         tuple: (python_ok, uv_installed, uvx_installed, word_server_installed)
 15 |     """
 16 |     # Check Python version
 17 |     python_version = sys.version_info
 18 |     python_ok = python_version >= (3, 8)
 19 |     
 20 |     # Check if uv/uvx is installed
 21 |     uv_installed = shutil.which("uv") is not None
 22 |     uvx_installed = shutil.which("uvx") is not None
 23 |     
 24 |     # Check if word-document-server is already installed via pip
 25 |     try:
 26 |         result = subprocess.run(
 27 |             [sys.executable, "-m", "pip", "show", "word-document-server"],
 28 |             capture_output=True,
 29 |             text=True,
 30 |             check=False
 31 |         )
 32 |         word_server_installed = result.returncode == 0
 33 |     except Exception:
 34 |         word_server_installed = False
 35 |         
 36 |     return (python_ok, uv_installed, uvx_installed, word_server_installed)
 37 | 
 38 | def get_transport_choice():
 39 |     """
 40 |     Ask user to choose transport type
 41 |     
 42 |     Returns:
 43 |         dict: Transport configuration
 44 |     """
 45 |     print("\nTransport Configuration:")
 46 |     print("1. STDIO (default, local execution)")
 47 |     print("2. Streamable HTTP (modern, recommended for web deployment)")
 48 |     print("3. SSE (Server-Sent Events, for compatibility)")
 49 |     
 50 |     choice = input("\nSelect transport type (1-3, default: 1): ").strip()
 51 |     
 52 |     if choice == "2":
 53 |         host = input("Host (default: 127.0.0.1): ").strip() or "127.0.0.1"
 54 |         port = input("Port (default: 8000): ").strip() or "8000"
 55 |         path = input("Path (default: /mcp): ").strip() or "/mcp"
 56 |         
 57 |         return {
 58 |             "transport": "streamable-http",
 59 |             "host": host,
 60 |             "port": port,
 61 |             "path": path
 62 |         }
 63 |     elif choice == "3":
 64 |         host = input("Host (default: 127.0.0.1): ").strip() or "127.0.0.1"
 65 |         port = input("Port (default: 8000): ").strip() or "8000"
 66 |         sse_path = input("SSE Path (default: /sse): ").strip() or "/sse"
 67 |         
 68 |         return {
 69 |             "transport": "sse",
 70 |             "host": host,
 71 |             "port": port,
 72 |             "sse_path": sse_path
 73 |         }
 74 |     else:
 75 |         # Default to stdio
 76 |         return {
 77 |             "transport": "stdio"
 78 |         }
 79 | 
 80 | def setup_venv():
 81 |     """
 82 |     Function to set up Python virtual environment
 83 |     
 84 |     Features:
 85 |     - Checks if Python version meets requirements (3.8+)
 86 |     - Creates Python virtual environment (if it doesn't exist)
 87 |     - Installs required dependencies in the newly created virtual environment
 88 |     
 89 |     No parameters required
 90 |     
 91 |     Returns: Path to Python interpreter in the virtual environment
 92 |     """
 93 |     # Check Python version
 94 |     python_version = sys.version_info
 95 |     if python_version.major < 3 or (python_version.major == 3 and python_version.minor < 8):
 96 |         print("Error: Python 3.8 or higher is required.")
 97 |         sys.exit(1)
 98 |     
 99 |     # Get absolute path of the directory containing the current script
100 |     base_path = os.path.abspath(os.path.dirname(__file__))
101 |     # Set virtual environment directory path
102 |     venv_path = os.path.join(base_path, '.venv')
103 |     
104 |     # Determine pip and python executable paths based on operating system
105 |     is_windows = platform.system() == "Windows"
106 |     if is_windows:
107 |         pip_path = os.path.join(venv_path, 'Scripts', 'pip.exe')
108 |         python_path = os.path.join(venv_path, 'Scripts', 'python.exe')
109 |     else:
110 |         pip_path = os.path.join(venv_path, 'bin', 'pip')
111 |         python_path = os.path.join(venv_path, 'bin', 'python')
112 |     
113 |     # Check if virtual environment already exists and is valid
114 |     venv_exists = os.path.exists(venv_path)
115 |     pip_exists = os.path.exists(pip_path)
116 |     
117 |     if not venv_exists or not pip_exists:
118 |         print("Creating new virtual environment...")
119 |         # Remove existing venv if it's invalid
120 |         if venv_exists and not pip_exists:
121 |             print("Existing virtual environment is incomplete, recreating it...")
122 |             try:
123 |                 shutil.rmtree(venv_path)
124 |             except Exception as e:
125 |                 print(f"Warning: Could not remove existing virtual environment: {e}")
126 |                 print("Please delete the .venv directory manually and try again.")
127 |                 sys.exit(1)
128 |         
129 |         # Create virtual environment
130 |         try:
131 |             subprocess.run([sys.executable, '-m', 'venv', venv_path], check=True)
132 |             print("Virtual environment created successfully!")
133 |         except subprocess.CalledProcessError as e:
134 |             print(f"Error creating virtual environment: {e}")
135 |             sys.exit(1)
136 |     else:
137 |         print("Valid virtual environment already exists.")
138 |     
139 |     # Double-check that pip exists after creating venv
140 |     if not os.path.exists(pip_path):
141 |         print(f"Error: pip executable not found at {pip_path}")
142 |         print("Try creating the virtual environment manually with: python -m venv .venv")
143 |         sys.exit(1)
144 |     
145 |     # Install or update dependencies
146 |     print("\nInstalling requirements...")
147 |     try:
148 |         # Install FastMCP package (standalone library)
149 |         subprocess.run([pip_path, 'install', 'fastmcp'], check=True)
150 |         # Install python-docx package
151 |         subprocess.run([pip_path, 'install', 'python-docx'], check=True)
152 |         
153 |         # Also install dependencies from requirements.txt if it exists
154 |         requirements_path = os.path.join(base_path, 'requirements.txt')
155 |         if os.path.exists(requirements_path):
156 |             subprocess.run([pip_path, 'install', '-r', requirements_path], check=True)
157 |         
158 |         print("Requirements installed successfully!")
159 |     except subprocess.CalledProcessError as e:
160 |         print(f"Error installing requirements: {e}")
161 |         sys.exit(1)
162 |     except FileNotFoundError:
163 |         print(f"Error: Could not execute {pip_path}")
164 |         print("Try activating the virtual environment manually and installing requirements:")
165 |         if is_windows:
166 |             print(".venv\\Scripts\\activate")
167 |         else:
168 |             print("source .venv/bin/activate")
169 |         print("pip install fastmcp python-docx")
170 |         sys.exit(1)
171 |     
172 |     return python_path
173 | 
174 | def generate_mcp_config_local(python_path, transport_config):
175 |     """
176 |     Generate MCP configuration for locally installed word-document-server
177 |     
178 |     Parameters:
179 |     - python_path: Path to Python interpreter in the virtual environment
180 |     - transport_config: Transport configuration dictionary
181 |     
182 |     Returns: Path to the generated config file
183 |     """
184 |     # Get absolute path of the directory containing the current script
185 |     base_path = os.path.abspath(os.path.dirname(__file__))
186 |     
187 |     # Path to Word Document Server script
188 |     server_script_path = os.path.join(base_path, 'word_mcp_server.py')
189 |     
190 |     # Build environment variables
191 |     env = {
192 |         "PYTHONPATH": base_path,
193 |         "MCP_TRANSPORT": transport_config["transport"]
194 |     }
195 |     
196 |     # Add transport-specific environment variables
197 |     if transport_config["transport"] == "streamable-http":
198 |         env.update({
199 |             "MCP_HOST": transport_config["host"],
200 |             "MCP_PORT": transport_config["port"],
201 |             "MCP_PATH": transport_config["path"]
202 |         })
203 |     elif transport_config["transport"] == "sse":
204 |         env.update({
205 |             "MCP_HOST": transport_config["host"],
206 |             "MCP_PORT": transport_config["port"],
207 |             "MCP_SSE_PATH": transport_config["sse_path"]
208 |         })
209 |     # For stdio transport, no additional environment variables needed
210 |     
211 |     # Create MCP configuration dictionary
212 |     config = {
213 |         "mcpServers": {
214 |             "word-document-server": {
215 |                 "command": python_path,
216 |                 "args": [server_script_path],
217 |                 "env": env
218 |             }
219 |         }
220 |     }
221 |     
222 |     # Save configuration to JSON file
223 |     config_path = os.path.join(base_path, 'mcp-config.json')
224 |     with open(config_path, 'w') as f:
225 |         json.dump(config, f, indent=2)
226 |     
227 |     return config_path
228 | 
229 | def generate_mcp_config_uvx(transport_config):
230 |     """
231 |     Generate MCP configuration for PyPI-installed word-document-server using UVX
232 |     
233 |     Parameters:
234 |     - transport_config: Transport configuration dictionary
235 |     
236 |     Returns: Path to the generated config file
237 |     """
238 |     # Get absolute path of the directory containing the current script
239 |     base_path = os.path.abspath(os.path.dirname(__file__))
240 |     
241 |     # Build environment variables
242 |     env = {
243 |         "MCP_TRANSPORT": transport_config["transport"]
244 |     }
245 |     
246 |     # Add transport-specific environment variables
247 |     if transport_config["transport"] == "streamable-http":
248 |         env.update({
249 |             "MCP_HOST": transport_config["host"],
250 |             "MCP_PORT": transport_config["port"],
251 |             "MCP_PATH": transport_config["path"]
252 |         })
253 |     elif transport_config["transport"] == "sse":
254 |         env.update({
255 |             "MCP_HOST": transport_config["host"],
256 |             "MCP_PORT": transport_config["port"],
257 |             "MCP_SSE_PATH": transport_config["sse_path"]
258 |         })
259 |     # For stdio transport, no additional environment variables needed
260 |     
261 |     # Create MCP configuration dictionary
262 |     config = {
263 |         "mcpServers": {
264 |             "word-document-server": {
265 |                 "command": "uvx",
266 |                 "args": ["--from", "word-mcp-server", "word_mcp_server"],
267 |                 "env": env
268 |             }
269 |         }
270 |     }
271 |     
272 |     # Save configuration to JSON file
273 |     config_path = os.path.join(base_path, 'mcp-config.json')
274 |     with open(config_path, 'w') as f:
275 |         json.dump(config, f, indent=2)
276 |     
277 |     return config_path
278 | 
279 | def generate_mcp_config_module(transport_config):
280 |     """
281 |     Generate MCP configuration for PyPI-installed word-document-server using Python module
282 |     
283 |     Parameters:
284 |     - transport_config: Transport configuration dictionary
285 |     
286 |     Returns: Path to the generated config file
287 |     """
288 |     # Get absolute path of the directory containing the current script
289 |     base_path = os.path.abspath(os.path.dirname(__file__))
290 |     
291 |     # Build environment variables
292 |     env = {
293 |         "MCP_TRANSPORT": transport_config["transport"]
294 |     }
295 |     
296 |     # Add transport-specific environment variables
297 |     if transport_config["transport"] == "streamable-http":
298 |         env.update({
299 |             "MCP_HOST": transport_config["host"],
300 |             "MCP_PORT": transport_config["port"],
301 |             "MCP_PATH": transport_config["path"]
302 |         })
303 |     elif transport_config["transport"] == "sse":
304 |         env.update({
305 |             "MCP_HOST": transport_config["host"],
306 |             "MCP_PORT": transport_config["port"],
307 |             "MCP_SSE_PATH": transport_config["sse_path"]
308 |         })
309 | 
310 |     
311 |     # Create MCP configuration dictionary
312 |     config = {
313 |         "mcpServers": {
314 |             "word-document-server": {
315 |                 "command": sys.executable,
316 |                 "args": ["-m", "word_document_server"],
317 |                 "env": env
318 |             }
319 |         }
320 |     }
321 |     
322 |     # Save configuration to JSON file
323 |     config_path = os.path.join(base_path, 'mcp-config.json')
324 |     with open(config_path, 'w') as f:
325 |         json.dump(config, f, indent=2)
326 |     
327 |     return config_path
328 | 
329 | def install_from_pypi():
330 |     """
331 |     Install word-document-server from PyPI
332 |     
333 |     Returns: True if successful, False otherwise
334 |     """
335 |     print("\nInstalling word-mcp-server from PyPI...")
336 |     try:
337 |         subprocess.run([sys.executable, "-m", "pip", "install", "word-mcp-server"], check=True)
338 |         print("word-mcp-server successfully installed from PyPI!")
339 |         return True
340 |     except subprocess.CalledProcessError:
341 |         print("Failed to install word-mcp-server from PyPI.")
342 |         return False
343 | 
344 | def print_config_instructions(config_path, transport_config):
345 |     """
346 |     Print instructions for using the generated config
347 |     
348 |     Parameters:
349 |     - config_path: Path to the generated config file
350 |     - transport_config: Transport configuration dictionary
351 |     """
352 |     print(f"\nMCP configuration has been written to: {config_path}")
353 |     
354 |     with open(config_path, 'r') as f:
355 |         config = json.load(f)
356 |     
357 |     print("\nMCP configuration for Claude Desktop:")
358 |     print(json.dumps(config, indent=2))
359 |     
360 |     # Print transport-specific instructions
361 |     if transport_config["transport"] == "streamable-http":
362 |         print(f"\n📡 Streamable HTTP Transport Configuration:")
363 |         print(f"   Server will be accessible at: http://{transport_config['host']}:{transport_config['port']}{transport_config['path']}")
364 |         print(f"   \n   To test the server manually:")
365 |         print(f"   curl -X POST http://{transport_config['host']}:{transport_config['port']}{transport_config['path']}")
366 |         
367 |     elif transport_config["transport"] == "sse":
368 |         print(f"\n📡 SSE Transport Configuration:")
369 |         print(f"   Server will be accessible at: http://{transport_config['host']}:{transport_config['port']}{transport_config['sse_path']}")
370 |         print(f"   \n   To test the server manually:")
371 |         print(f"   curl http://{transport_config['host']}:{transport_config['port']}{transport_config['sse_path']}")
372 |         
373 |     else:  # stdio
374 |         print(f"\n💻 STDIO Transport Configuration:")
375 |         print(f"   Server runs locally with standard input/output")
376 |     
377 |     # Provide instructions for adding configuration to Claude Desktop configuration file
378 |     if platform.system() == "Windows":
379 |         claude_config_path = os.path.expandvars("%APPDATA%\\Claude\\claude_desktop_config.json")
380 |     else:  # macOS path; also used as the fallback on other platforms
381 |         claude_config_path = os.path.expanduser("~/Library/Application Support/Claude/claude_desktop_config.json")
382 |     
383 |     print(f"\nTo use with Claude Desktop, merge this configuration into: {claude_config_path}")
384 | 
385 | def create_package_structure():
386 |     """
387 |     Create necessary package structure and environment files
388 |     """
389 |     # Get absolute path of the directory containing the current script
390 |     base_path = os.path.abspath(os.path.dirname(__file__))
391 |     
392 |     # Create __init__.py file
393 |     init_path = os.path.join(base_path, '__init__.py')
394 |     if not os.path.exists(init_path):
395 |         with open(init_path, 'w') as f:
396 |             f.write('# Word Document MCP Server')
397 |         print(f"Created __init__.py at: {init_path}")
398 |     
399 |     # Create requirements.txt file
400 |     requirements_path = os.path.join(base_path, 'requirements.txt')
401 |     if not os.path.exists(requirements_path):
402 |         with open(requirements_path, 'w') as f:
403 |             f.write('fastmcp\npython-docx\nmsoffcrypto-tool\ndocx2pdf\nhttpx\ncryptography\n')
404 |         print(f"Created requirements.txt at: {requirements_path}")
405 |     
406 |     # Create .env.example file
407 |     env_example_path = os.path.join(base_path, '.env.example')
408 |     if not os.path.exists(env_example_path):
409 |         with open(env_example_path, 'w') as f:
410 |             f.write("""# Transport Configuration
411 | # Valid options: stdio, streamable-http, sse
412 | MCP_TRANSPORT=stdio
413 | 
414 | # HTTP/SSE Configuration (when not using stdio)
415 | MCP_HOST=127.0.0.1
416 | MCP_PORT=8000
417 | 
418 | # Streamable HTTP specific
419 | MCP_PATH=/mcp
420 | 
421 | # SSE specific  
422 | MCP_SSE_PATH=/sse
423 | 
424 | """)
425 |         print(f"Created .env.example at: {env_example_path}")
426 | 
427 | # Main execution entry point
428 | if __name__ == '__main__':
429 |     # Check prerequisites
430 |     python_ok, uv_installed, uvx_installed, word_server_installed = check_prerequisites()
431 |     
432 |     if not python_ok:
433 |         print("Error: Python 3.8 or higher is required.")
434 |         sys.exit(1)
435 |     
436 |     print("Word Document MCP Server Setup (Multi-Transport)")
437 |     print("===============================================\n")
438 |     
439 |     # Create necessary files
440 |     create_package_structure()
441 |     
442 |     # Get transport configuration
443 |     transport_config = get_transport_choice()
444 |     
445 |     # If word-document-server is already installed, offer config options
446 |     if word_server_installed:
447 |         print("word-document-server is already installed via pip.")
448 |         
449 |         if uvx_installed:
450 |             print("\nOptions:")
451 |             print("1. Generate MCP config for UVX (recommended)")
452 |             print("2. Generate MCP config for Python module")
453 |             print("3. Set up local development environment")
454 |             
455 |             choice = input("\nEnter your choice (1-3): ")
456 |             
457 |             if choice == "1":
458 |                 config_path = generate_mcp_config_uvx(transport_config)
459 |                 print_config_instructions(config_path, transport_config)
460 |             elif choice == "2":
461 |                 config_path = generate_mcp_config_module(transport_config)
462 |                 print_config_instructions(config_path, transport_config)
463 |             elif choice == "3":
464 |                 python_path = setup_venv()
465 |                 config_path = generate_mcp_config_local(python_path, transport_config)
466 |                 print_config_instructions(config_path, transport_config)
467 |             else:
468 |                 print("Invalid choice. Exiting.")
469 |                 sys.exit(1)
470 |         else:
471 |             print("\nOptions:")
472 |             print("1. Generate MCP config for Python module")
473 |             print("2. Set up local development environment")
474 |             
475 |             choice = input("\nEnter your choice (1-2): ")
476 |             
477 |             if choice == "1":
478 |                 config_path = generate_mcp_config_module(transport_config)
479 |                 print_config_instructions(config_path, transport_config)
480 |             elif choice == "2":
481 |                 python_path = setup_venv()
482 |                 config_path = generate_mcp_config_local(python_path, transport_config)
483 |                 print_config_instructions(config_path, transport_config)
484 |             else:
485 |                 print("Invalid choice. Exiting.")
486 |                 sys.exit(1)
487 |     
488 |     # If word-document-server is not installed, offer installation options
489 |     else:
490 |         print("word-document-server is not installed.")
491 |         
492 |         print("\nOptions:")
493 |         print("1. Install from PyPI (recommended)")
494 |         print("2. Set up local development environment")
495 |         
496 |         choice = input("\nEnter your choice (1-2): ")
497 |         
498 |         if choice == "1":
499 |             if install_from_pypi():
500 |                 if uvx_installed:
501 |                     print("\nNow generating MCP config for UVX...")
502 |                     config_path = generate_mcp_config_uvx(transport_config)
503 |                 else:
504 |                     print("\nUVX not found. Generating MCP config for Python module...")
505 |                     config_path = generate_mcp_config_module(transport_config)
506 |                 print_config_instructions(config_path, transport_config)
507 |         elif choice == "2":
508 |             python_path = setup_venv()
509 |             config_path = generate_mcp_config_local(python_path, transport_config)
510 |             print_config_instructions(config_path, transport_config)
511 |         else:
512 |             print("Invalid choice. Exiting.")
513 |             sys.exit(1)
514 |     
515 |     print("\nSetup complete! You can now use the Word Document MCP server with compatible clients like Claude Desktop.")
516 |     print("\nTransport Summary:")
517 |     print(f"  - Transport: {transport_config['transport']}")
518 |     if transport_config['transport'] != 'stdio':
519 |         print(f"  - Host: {transport_config.get('host', 'N/A')}")
520 |         print(f"  - Port: {transport_config.get('port', 'N/A')}")
521 |         if transport_config['transport'] == 'streamable-http':
522 |             print(f"  - Path: {transport_config.get('path', 'N/A')}")
523 |         elif transport_config['transport'] == 'sse':
524 |             print(f"  - SSE Path: {transport_config.get('sse_path', 'N/A')}")
```

--------------------------------------------------------------------------------
/word_document_server/utils/document_utils.py:
--------------------------------------------------------------------------------

```python
  1 | """
  2 | Document utility functions for Word Document Server.
  3 | """
  4 | import json
  5 | from typing import Dict, List, Any
  6 | from docx import Document
  7 | from docx.oxml.table import CT_Tbl
  8 | from docx.oxml.text.paragraph import CT_P
  9 | from docx.oxml.ns import qn
 10 | from docx.oxml import OxmlElement
 11 | 
 12 | 
 13 | def get_document_properties(doc_path: str) -> Dict[str, Any]:
 14 |     """Get properties of a Word document."""
 15 |     import os
 16 |     if not os.path.exists(doc_path):
 17 |         return {"error": f"Document {doc_path} does not exist"}
 18 |     
 19 |     try:
 20 |         doc = Document(doc_path)
 21 |         core_props = doc.core_properties
 22 |         
 23 |         return {
 24 |             "title": core_props.title or "",
 25 |             "author": core_props.author or "",
 26 |             "subject": core_props.subject or "",
 27 |             "keywords": core_props.keywords or "",
 28 |             "created": str(core_props.created) if core_props.created else "",
 29 |             "modified": str(core_props.modified) if core_props.modified else "",
 30 |             "last_modified_by": core_props.last_modified_by or "",
 31 |             "revision": core_props.revision or 0,
 32 |             "page_count": len(doc.sections),  # NOTE: this is the section count; python-docx does not expose a rendered page count
 33 |             "word_count": sum(len(paragraph.text.split()) for paragraph in doc.paragraphs),
 34 |             "paragraph_count": len(doc.paragraphs),
 35 |             "table_count": len(doc.tables)
 36 |         }
 37 |     except Exception as e:
 38 |         return {"error": f"Failed to get document properties: {str(e)}"}
 39 | 
 40 | 
 41 | def extract_document_text(doc_path: str) -> str:
 42 |     """Extract all text from a Word document."""
 43 |     import os
 44 |     if not os.path.exists(doc_path):
 45 |         return f"Document {doc_path} does not exist"
 46 |     
 47 |     try:
 48 |         doc = Document(doc_path)
 49 |         text = []
 50 |         
 51 |         for paragraph in doc.paragraphs:
 52 |             text.append(paragraph.text)
 53 |             
 54 |         for table in doc.tables:
 55 |             for row in table.rows:
 56 |                 for cell in row.cells:
 57 |                     for paragraph in cell.paragraphs:
 58 |                         text.append(paragraph.text)
 59 |         
 60 |         return "\n".join(text)
 61 |     except Exception as e:
 62 |         return f"Failed to extract text: {str(e)}"
 63 | 
 64 | 
 65 | def get_document_structure(doc_path: str) -> Dict[str, Any]:
 66 |     """Get the structure of a Word document."""
 67 |     import os
 68 |     if not os.path.exists(doc_path):
 69 |         return {"error": f"Document {doc_path} does not exist"}
 70 |     
 71 |     try:
 72 |         doc = Document(doc_path)
 73 |         structure = {
 74 |             "paragraphs": [],
 75 |             "tables": []
 76 |         }
 77 |         
 78 |         # Get paragraphs
 79 |         for i, para in enumerate(doc.paragraphs):
 80 |             structure["paragraphs"].append({
 81 |                 "index": i,
 82 |                 "text": para.text[:100] + ("..." if len(para.text) > 100 else ""),
 83 |                 "style": para.style.name if para.style else "Normal"
 84 |             })
 85 |         
 86 |         # Get tables
 87 |         for i, table in enumerate(doc.tables):
 88 |             table_data = {
 89 |                 "index": i,
 90 |                 "rows": len(table.rows),
 91 |                 "columns": len(table.columns),
 92 |                 "preview": []
 93 |             }
 94 |             
 95 |             # Get sample of table data
 96 |             max_rows = min(3, len(table.rows))
 97 |             for row_idx in range(max_rows):
 98 |                 row_data = []
 99 |                 max_cols = min(3, len(table.columns))
100 |                 for col_idx in range(max_cols):
101 |                     try:
102 |                         cell_text = table.cell(row_idx, col_idx).text
103 |                         row_data.append(cell_text[:20] + ("..." if len(cell_text) > 20 else ""))
104 |                     except IndexError:
105 |                         row_data.append("N/A")
106 |                 table_data["preview"].append(row_data)
107 |             
108 |             structure["tables"].append(table_data)
109 |         
110 |         return structure
111 |     except Exception as e:
112 |         return {"error": f"Failed to get document structure: {str(e)}"}
113 | 
114 | 
115 | def find_paragraph_by_text(doc, text, partial_match=False):
116 |     """
117 |     Find paragraphs containing specific text.
118 |     
119 |     Args:
120 |         doc: Document object
121 |         text: Text to search for
122 |         partial_match: If True, matches paragraphs containing the text; if False, matches exact text
123 |         
124 |     Returns:
125 |         List of paragraph indices that match the criteria
126 |     """
127 |     matching_paragraphs = []
128 |     
129 |     for i, para in enumerate(doc.paragraphs):
130 |         if partial_match and text in para.text:
131 |             matching_paragraphs.append(i)
132 |         elif not partial_match and para.text == text:
133 |             matching_paragraphs.append(i)
134 |             
135 |     return matching_paragraphs
136 | 
137 | 
138 | def find_and_replace_text(doc, old_text, new_text):
139 |     """
140 |     Find and replace text throughout the document, skipping Table of Contents (TOC) paragraphs.
141 |     
142 |     Args:
143 |         doc: Document object
144 |         old_text: Text to find
145 |         new_text: Text to replace with
146 |         
147 |     Returns:
148 |         Number of replacements made
149 |     """
150 |     count = 0
151 |     
152 |     # Search in paragraphs
153 |     for para in doc.paragraphs:
154 |         # Skip TOC paragraphs
155 |         if para.style and para.style.name.startswith("TOC"):
156 |             continue
157 |         if old_text in para.text:
158 |             for run in para.runs:
159 |                 if old_text in run.text:
160 |                     run.text = run.text.replace(old_text, new_text)
161 |                     count += 1
162 |     
163 |     # Search in tables
164 |     for table in doc.tables:
165 |         for row in table.rows:
166 |             for cell in row.cells:
167 |                 for para in cell.paragraphs:
168 |                     # Skip TOC paragraphs in tables
169 |                     if para.style and para.style.name.startswith("TOC"):
170 |                         continue
171 |                     if old_text in para.text:
172 |                         for run in para.runs:
173 |                             if old_text in run.text:
174 |                                 run.text = run.text.replace(old_text, new_text)
175 |                                 count += 1
176 |     
177 |     return count
178 | 
179 | 
180 | def get_document_xml(doc_path: str) -> str:
181 |     """Extract and return the raw XML structure of the Word document (word/document.xml)."""
182 |     import os
183 |     import zipfile
184 |     if not os.path.exists(doc_path):
185 |         return f"Document {doc_path} does not exist"
186 |     try:
187 |         with zipfile.ZipFile(doc_path) as docx_zip:
188 |             with docx_zip.open('word/document.xml') as xml_file:
189 |                 return xml_file.read().decode('utf-8')
190 |     except Exception as e:
191 |         return f"Failed to extract XML: {str(e)}"
192 | 
193 | 
194 | def insert_header_near_text(doc_path: str, target_text: str = None, header_title: str = "", position: str = 'after', header_style: str = 'Heading 1', target_paragraph_index: int = None) -> str:
195 |     """Insert a header (with specified style) before or after the target paragraph. Specify by text or paragraph index. Skips TOC paragraphs in text search."""
196 |     import os
197 |     from docx import Document
198 |     if not os.path.exists(doc_path):
199 |         return f"Document {doc_path} does not exist"
200 |     try:
201 |         doc = Document(doc_path)
202 |         found = False
203 |         para = None
204 |         if target_paragraph_index is not None:
205 |             if target_paragraph_index < 0 or target_paragraph_index >= len(doc.paragraphs):
206 |                 return f"Invalid target_paragraph_index: {target_paragraph_index}. Document has {len(doc.paragraphs)} paragraphs."
207 |             para = doc.paragraphs[target_paragraph_index]
208 |             found = True
209 |         else:
210 |             for i, p in enumerate(doc.paragraphs):
211 |                 # Skip TOC paragraphs
212 |                 if p.style and p.style.name.lower().startswith("toc"):
213 |                     continue
214 |                 if target_text and target_text in p.text:
215 |                     para = p
216 |                     found = True
217 |                     break
218 |         if not found or para is None:
219 |             return f"Target paragraph not found (by index or text). (TOC paragraphs are skipped in text search)"
220 |         # Save anchor index before insertion
221 |         if target_paragraph_index is not None:
222 |             anchor_index = target_paragraph_index
223 |         else:
224 |             anchor_index = None
225 |             for i, p in enumerate(doc.paragraphs):
226 |                 if p is para:
227 |                     anchor_index = i
228 |                     break
229 |         new_para = doc.add_paragraph(header_title, style=header_style)
230 |         if position == 'before':
231 |             para._element.addprevious(new_para._element)
232 |         else:
233 |             para._element.addnext(new_para._element)
234 |         doc.save(doc_path)
235 |         if anchor_index is not None:
236 |             return f"Header '{header_title}' (style: {header_style}) inserted {position} paragraph (index {anchor_index})."
237 |         else:
238 |             return f"Header '{header_title}' (style: {header_style}) inserted {position} the target paragraph."
239 |     except Exception as e:
240 |         return f"Failed to insert header: {str(e)}"
241 | 
242 | 
243 | def insert_line_or_paragraph_near_text(doc_path: str, target_text: str = None, line_text: str = "", position: str = 'after', line_style: str = None, target_paragraph_index: int = None) -> str:
244 |     """
245 |     Insert a new line or paragraph (with specified or matched style) before or after the target paragraph.
246 |     You can specify the target by text (first match) or by paragraph index.
247 |     Skips paragraphs whose style name starts with 'TOC' if using text search.
248 |     """
249 |     import os
250 |     from docx import Document
251 |     if not os.path.exists(doc_path):
252 |         return f"Document {doc_path} does not exist"
253 |     try:
254 |         doc = Document(doc_path)
255 |         found = False
256 |         para = None
257 |         if target_paragraph_index is not None:
258 |             if target_paragraph_index < 0 or target_paragraph_index >= len(doc.paragraphs):
259 |                 return f"Invalid target_paragraph_index: {target_paragraph_index}. Document has {len(doc.paragraphs)} paragraphs."
260 |             para = doc.paragraphs[target_paragraph_index]
261 |             found = True
262 |         else:
263 |             for i, p in enumerate(doc.paragraphs):
264 |                 # Skip TOC paragraphs
265 |                 if p.style and p.style.name.lower().startswith("toc"):
266 |                     continue
267 |                 if target_text and target_text in p.text:
268 |                     para = p
269 |                     found = True
270 |                     break
271 |         if not found or para is None:
272 |             return f"Target paragraph not found (by index or text). (TOC paragraphs are skipped in text search)"
273 |         # Save anchor index before insertion
274 |         if target_paragraph_index is not None:
275 |             anchor_index = target_paragraph_index
276 |         else:
277 |             anchor_index = None
278 |             for i, p in enumerate(doc.paragraphs):
279 |                 if p is para:
280 |                     anchor_index = i
281 |                     break
282 |         # Determine style: use provided or match target
283 |         style = line_style if line_style else para.style
284 |         new_para = doc.add_paragraph(line_text, style=style)
285 |         if position == 'before':
286 |             para._element.addprevious(new_para._element)
287 |         else:
288 |             para._element.addnext(new_para._element)
289 |         doc.save(doc_path)
290 |         if anchor_index is not None:
291 |             return f"Line/paragraph inserted {position} paragraph (index {anchor_index}) with style '{style}'."
292 |         else:
293 |             return f"Line/paragraph inserted {position} the target paragraph with style '{style}'."
294 |     except Exception as e:
295 |         return f"Failed to insert line/paragraph: {str(e)}"
296 | 
297 | 
298 | def add_bullet_numbering(paragraph, num_id=1, level=0):
299 |     """
300 |     Add bullet/numbering XML to a paragraph.
301 | 
302 |     Args:
303 |         paragraph: python-docx Paragraph object
304 |         num_id: Numbering definition ID (callers in this module assume 1=bullets, 2=numbers; the actual mapping depends on the document's numbering.xml)
305 |         level: Indentation level (0=first level, 1=second level, etc.)
306 | 
307 |     Returns:
308 |         The modified paragraph
309 |     """
310 |     # Get or create paragraph properties
311 |     pPr = paragraph._element.get_or_add_pPr()
312 | 
313 |     # Remove existing numPr if any (to avoid duplicates)
314 |     existing_numPr = pPr.find(qn('w:numPr'))
315 |     if existing_numPr is not None:
316 |         pPr.remove(existing_numPr)
317 | 
318 |     # Create numbering properties element
319 |     numPr = OxmlElement('w:numPr')
320 | 
321 |     # Set indentation level
322 |     ilvl = OxmlElement('w:ilvl')
323 |     ilvl.set(qn('w:val'), str(level))
324 |     numPr.append(ilvl)
325 | 
326 |     # Set numbering definition ID
327 |     numId = OxmlElement('w:numId')
328 |     numId.set(qn('w:val'), str(num_id))
329 |     numPr.append(numId)
330 | 
331 |     # Add to paragraph properties
332 |     pPr.append(numPr)
333 | 
334 |     return paragraph
335 | 
336 | 
337 | def insert_numbered_list_near_text(doc_path: str, target_text: str = None, list_items: list = None, position: str = 'after', target_paragraph_index: int = None, bullet_type: str = 'bullet') -> str:
338 |     """
339 |     Insert a bulleted or numbered list before or after the target paragraph. Specify by text or paragraph index. Skips TOC paragraphs in text search.
340 |     Args:
341 |         doc_path: Path to the Word document
342 |         target_text: Text to search for in paragraphs (optional if using index)
343 |         list_items: List of strings, each as a list item
344 |         position: 'before' or 'after' (default: 'after')
345 |         target_paragraph_index: Optional paragraph index to use as anchor
346 |         bullet_type: 'bullet' for bullets (•), 'number' for numbers (1,2,3) (default: 'bullet')
347 |     Returns:
348 |         Status message
349 |     """
350 |     import os
351 |     from docx import Document
352 |     if not os.path.exists(doc_path):
353 |         return f"Document {doc_path} does not exist"
354 |     try:
355 |         doc = Document(doc_path)
356 |         found = False
357 |         para = None
358 |         if target_paragraph_index is not None:
359 |             if target_paragraph_index < 0 or target_paragraph_index >= len(doc.paragraphs):
360 |                 return f"Invalid target_paragraph_index: {target_paragraph_index}. Document has {len(doc.paragraphs)} paragraphs."
361 |             para = doc.paragraphs[target_paragraph_index]
362 |             found = True
363 |         else:
364 |             for i, p in enumerate(doc.paragraphs):
365 |                 # Skip TOC paragraphs
366 |                 if p.style and p.style.name.lower().startswith("toc"):
367 |                     continue
368 |                 if target_text and target_text in p.text:
369 |                     para = p
370 |                     found = True
371 |                     break
372 |         if not found or para is None:
373 |             return f"Target paragraph not found (by index or text). (TOC paragraphs are skipped in text search)"
374 |         # Save anchor index before insertion
375 |         if target_paragraph_index is not None:
376 |             anchor_index = target_paragraph_index
377 |         else:
378 |             anchor_index = None
379 |             for i, p in enumerate(doc.paragraphs):
380 |                 if p is para:
381 |                     anchor_index = i
382 |                     break
383 |         # Determine numbering ID based on bullet_type
384 |         num_id = 1 if bullet_type == 'bullet' else 2
385 | 
386 |         # Use ListParagraph style for proper list formatting
387 |         style_name = None
388 |         for candidate in ['List Paragraph', 'ListParagraph', 'Normal']:
389 |             try:
390 |                 _ = doc.styles[candidate]
391 |                 style_name = candidate
392 |                 break
393 |             except KeyError:
394 |                 continue
395 |         if not style_name:
396 |             style_name = None  # fallback to default
397 | 
398 |         new_paras = []
399 |         for item in (list_items or []):
400 |             p = doc.add_paragraph(item, style=style_name)
401 |             # Attach <w:numPr> so the paragraph is rendered as a list item
402 |             add_bullet_numbering(p, num_id=num_id, level=0)
403 |             new_paras.append(p)
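404 |         # Move the new paragraphs into place; iterate forward for 'before' and in reverse for 'after' so item order is preserved
405 |         for p in (new_paras if position == 'before' else reversed(new_paras)):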
406 |             if position == 'before':
407 |                 para._element.addprevious(p._element)
408 |             else:
409 |                 para._element.addnext(p._element)
410 |         doc.save(doc_path)
411 |         list_type = "bulleted" if bullet_type == 'bullet' else "numbered"
412 |         if anchor_index is not None:
413 |             return f"{list_type.capitalize()} list with {len(new_paras)} items inserted {position} paragraph (index {anchor_index})."
414 |         else:
415 |             return f"{list_type.capitalize()} list with {len(new_paras)} items inserted {position} the target paragraph."
416 |     except Exception as e:
417 |         return f"Failed to insert numbered list: {str(e)}"
418 | 
419 | 
420 | def is_toc_paragraph(para):
421 |     """Return True if the paragraph has a table of contents (TOC) style."""
422 |     return para.style and para.style.name.upper().startswith("TOC")
423 | 
424 | 
425 | def is_heading_paragraph(para):
426 |     """Return True if the paragraph has a heading style (Heading 1, Heading 2, etc.)."""
427 |     return para.style and para.style.name.lower().startswith("heading")
428 | 
429 | 
430 | # --- Helper: Get style name from a <w:p> element ---
431 | def get_paragraph_style(el):
432 |     from docx.oxml.ns import qn
433 |     pPr = el.find(qn('w:pPr'))
434 |     if pPr is not None:
435 |         pStyle = pPr.find(qn('w:pStyle'))
436 |         if pStyle is not None and pStyle.get(qn('w:val')) is not None:
437 |             return pStyle.get(qn('w:val'))
438 |     return None
439 | 
440 | # --- Main: Delete everything under a header until next heading/TOC ---
441 | def delete_block_under_header(doc, header_text):
442 |     """
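443 |     Remove all paragraphs after the header (matched by text, case-insensitive) and before the next heading/TOC paragraph (matched by style). Tables and other non-paragraph elements are left in place.
444 |     Returns: (header_element, paragraphs_removed), or (None, 0) if the header is not found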
445 |     """
446 |     # Find the header paragraph by text (like delete_paragraph finds by index)
447 |     header_para = None
448 |     header_idx = None
449 |     
450 |     for i, para in enumerate(doc.paragraphs):
451 |         if para.text.strip().lower() == header_text.strip().lower() and not is_toc_paragraph(para):
452 |             header_para = para
453 |             header_idx = i
454 |             break
455 |     
456 |     if header_para is None:
457 |         return None, 0
458 |     
459 |     # Find the next heading/TOC paragraph to determine the end of the block
460 |     end_idx = None
461 |     for i in range(header_idx + 1, len(doc.paragraphs)):
462 |         para = doc.paragraphs[i]
463 |         if para.style and para.style.name.lower().startswith(('heading', 'título', 'toc')):
464 |             end_idx = i
465 |             break
466 |     
467 |     # If no next heading found, delete until end of document
468 |     if end_idx is None:
469 |         end_idx = len(doc.paragraphs)
470 |     
471 |     # Remove the block one paragraph at a time; doc.paragraphs is rebuilt on every access
472 |     removed_count = 0
473 |     for _ in range(header_idx + 1, end_idx):
474 |         if header_idx + 1 < len(doc.paragraphs):  # Safety check against the shrinking list
475 |             para = doc.paragraphs[header_idx + 1]  # Always remove the first paragraph after header
476 |             p = para._p
477 |             p.getparent().remove(p)
478 |             removed_count += 1
479 |     
480 |     return header_para._p, removed_count
481 | 
482 | # --- Usage in replace_paragraph_block_below_header ---
483 | def replace_paragraph_block_below_header(
484 |     doc_path: str,
485 |     header_text: str,
486 |     new_paragraphs: list,
487 |     detect_block_end_fn=None,
488 |     new_paragraph_style: str = None
489 | ) -> str:
490 |     """
491 |     Replace all content below a header (matched by text) up to the next heading/TOC paragraph (matched by style).
492 |     """
493 |     from docx import Document
494 |     import os
495 |     if not os.path.exists(doc_path):
496 |         return f"Document {doc_path} not found."
497 |     
498 |     doc = Document(doc_path)
499 |     
500 |     # Find the header paragraph first
501 |     header_para = None
502 |     header_idx = None
503 |     for i, para in enumerate(doc.paragraphs):
504 |         para_text = para.text.strip().lower()
505 |         is_toc = is_toc_paragraph(para)
506 |         if para_text == header_text.strip().lower() and not is_toc:
507 |             header_para = para
508 |             header_idx = i
509 |             break
510 |     
511 |     if header_para is None:
512 |         return f"Header '{header_text}' not found in document."
513 |     
514 |     # Delete everything under the header using the same document instance
515 |     header_el, removed_count = delete_block_under_header(doc, header_text)
516 |     
517 |     # Now insert new paragraphs after the header (which should still be in the document)
518 |     style_to_use = new_paragraph_style or "Normal"
519 |     
520 |     # Reuse the header paragraph found above; the deletion step leaves it in the document
521 |     current_para = header_para
522 |     for text in new_paragraphs:
523 |         new_para = doc.add_paragraph(text, style=style_to_use)
524 |         current_para._element.addnext(new_para._element)
525 |         current_para = new_para
526 |     
527 |     doc.save(doc_path)
528 |     return f"Replaced content under '{header_text}' with {len(new_paragraphs)} paragraph(s), style: {style_to_use}, removed {removed_count} elements."
529 | 
530 | 
531 | def replace_block_between_manual_anchors(
532 |     doc_path: str,
533 |     start_anchor_text: str,
534 |     new_paragraphs: list,
535 |     end_anchor_text: str = None,
536 |     match_fn=None,
537 |     new_paragraph_style: str = None
538 | ) -> str:
539 |     """
540 |     Replace all content (paragraphs, tables, etc.) between start_anchor_text and end_anchor_text (or next logical header if not provided).
541 |     If end_anchor_text is None, deletes until the next paragraph containing a run with explicit bold, all-caps, or font-size properties, or until the end of the document.
542 |     Inserts new_paragraphs after the start anchor.
543 |     """
544 |     from docx import Document
545 |     import os
546 |     if not os.path.exists(doc_path):
547 |         return f"Document {doc_path} not found."
548 |     doc = Document(doc_path)
549 |     body = doc.element.body
550 |     elements = list(body)
551 |     start_idx = None
552 |     end_idx = None
553 |     # Find start anchor
554 |     for i, el in enumerate(elements):
555 |         if el.tag == CT_P.tag:
556 |             p_text = "".join([node.text or '' for node in el.iter() if node.tag.endswith('}t')]).strip()
557 |             if match_fn:
558 |                 if match_fn(p_text, el):
559 |                     start_idx = i
560 |                     break
561 |             elif p_text == start_anchor_text.strip():
562 |                 start_idx = i
563 |                 break
564 |     if start_idx is None:
565 |         return f"Start anchor '{start_anchor_text}' not found."
566 |     # Find end anchor
567 |     if end_anchor_text:
568 |         for i in range(start_idx + 1, len(elements)):
569 |             el = elements[i]
570 |             if el.tag == CT_P.tag:
571 |                 p_text = "".join([node.text or '' for node in el.iter() if node.tag.endswith('}t')]).strip()
572 |                 if match_fn:
573 |                     if match_fn(p_text, el, is_end=True):
574 |                         end_idx = i
575 |                         break
576 |                 elif p_text == end_anchor_text.strip():
577 |                     end_idx = i
578 |                     break
579 |     else:
580 |         # Heuristic: next visually distinct paragraph (bold, all caps, or different font size), or end of document
581 |         for i in range(start_idx + 1, len(elements)):
582 |             el = elements[i]
583 |             if el.tag == CT_P.tag:
584 |                 # Check for bold, all caps, or font size
585 |                 runs = [node for node in el.iter() if node.tag.endswith('}r')]
586 |                 for run in runs:
587 |                     rpr = run.find(qn('w:rPr'))
588 |                     if rpr is not None:
589 |                         if rpr.find(qn('w:b')) is not None or rpr.find(qn('w:caps')) is not None or rpr.find(qn('w:sz')) is not None:
590 |                             end_idx = i
591 |                             break
592 |                 if end_idx is not None:
593 |                     break
594 |     # Mark elements for removal; keep the body-level <w:sectPr> (page size, margins)
595 |     stop = end_idx if end_idx is not None else len(elements)
596 |     to_remove = [el for el in elements[start_idx + 1:stop]
597 |                  if el.tag != qn('w:sectPr')]
598 |     for el in to_remove:
599 |         body.remove(el)
600 |     doc.save(doc_path)
601 |     # Reload and find start anchor for insertion
602 |     doc = Document(doc_path)
603 |     paras = doc.paragraphs
604 |     anchor_idx = None
605 |     for i, para in enumerate(paras):
606 |         if para.text.strip() == start_anchor_text.strip():
607 |             anchor_idx = i
608 |             break
609 |     if anchor_idx is None:
610 |         return f"Start anchor '{start_anchor_text}' not found after deletion (unexpected)."
611 |     anchor_para = paras[anchor_idx]
612 |     style_to_use = new_paragraph_style or "Normal"
613 |     for text in new_paragraphs:
614 |         new_para = doc.add_paragraph(text, style=style_to_use)
615 |         anchor_para._element.addnext(new_para._element)
616 |         anchor_para = new_para
617 |     doc.save(doc_path)
618 |     return f"Replaced content between '{start_anchor_text}' and '{end_anchor_text or 'next logical header'}' with {len(new_paragraphs)} paragraph(s), style: {style_to_use}, removed {len(to_remove)} elements."
619 | 
```
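
A note on `find_and_replace_text` above: it checks `old_text in para.text` for the whole paragraph, but then replaces run by run. The sketch below models that loop with plain strings rather than python-docx objects, to show what this likely means when the search text straddles two runs. The helper name `replace_per_run` and the sample strings are hypothetical and not part of the repository.

```python
from typing import List, Tuple


def replace_per_run(run_texts: List[str], old_text: str, new_text: str) -> Tuple[List[str], int]:
    """Mimic the per-run loop: each run is searched and replaced in isolation."""
    count = 0
    result = []
    for text in run_texts:
        if old_text in text:
            text = text.replace(old_text, new_text)
            count += 1  # counts runs touched, not individual occurrences
        result.append(text)
    return result, count


# One run holds the whole phrase, so the replacement is applied.
single = replace_per_run(["Hello World"], "World", "There")

# The same phrase split across two runs: the joined paragraph text
# contains "World", but no single run does, so nothing is replaced.
split = replace_per_run(["Hello Wo", "rld"], "World", "There")

print(single)
print(split)
```

If this model matches the real behavior, a return value of 0 from `find_and_replace_text` does not by itself prove the text is absent from the document; it may only mean the match is split across runs.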