Compare commits

61 Commits

Author SHA1 Message Date
yyhuni
236c828041 chore(fingerprints): remove deprecated ARL fingerprint rules
- Remove obsolete fingerprint detection rules from ARL.yaml
- Clean up legacy device and service signatures that are no longer maintained
- Reduce fingerprint database size by eliminating unused detection patterns
- Improve maintainability by removing outdated vendor-specific rules
2026-01-01 22:45:08 +08:00
yyhuni
fb13bb74d8 feat(filter): add array fuzzy search support with PostgreSQL array_to_string
- Add ArrayToString custom PostgreSQL function for converting arrays to delimited strings
- Implement array field annotation in QueryBuilder to support fuzzy matching on JSON array fields
- Enhance _build_single_q to handle three operators for JSON arrays: exact match (==), negation (!=), and fuzzy search (=)
- Update target navigation routes from subdomain to website view for consistency
- Enable fuzzy search on array fields by converting them to text during query building
2026-01-01 22:41:57 +08:00
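The three array operators described in this commit can be illustrated in memory. The following is a minimal sketch, not the project's QueryBuilder: `array_to_string` and `match_array` are hypothetical helpers, and treating the fuzzy operator as a case-insensitive substring test is an assumption.

```python
from typing import List


def array_to_string(values: List[str], delimiter: str = ",") -> str:
    """Mimic PostgreSQL array_to_string for arrays without NULL elements."""
    return delimiter.join(values)


def match_array(values: List[str], operator: str, needle: str) -> bool:
    """Evaluate one filter against an array field (hypothetical helper)."""
    if operator == "==":
        # Exact match: the array must contain the value as a whole element.
        return needle in values
    if operator == "!=":
        # Negation: the array must not contain the value as an element.
        return needle not in values
    if operator == "=":
        # Fuzzy search: flatten the array to text, then do a substring test.
        # Case-insensitivity is an assumption of this sketch.
        return needle.lower() in array_to_string(values).lower()
    raise ValueError(f"unsupported operator: {operator!r}")


tech = ["nginx", "PHP"]
print(match_array(tech, "==", "ngin"))  # exact match needs a whole element
print(match_array(tech, "=", "ngin"))   # fuzzy search matches a substring
```

Flattening the array to a delimited string is what lets a single text operator cover every element at once.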
yyhuni
f076c682b6 feat(scan): add multi-engine support and config merging with enhanced indexing
- Add multi-engine support to Scan model with engine_ids and engine_names fields
- Implement config_merger utility for merging multiple engine configurations
- Add merged_configuration property to Scan model for unified config access
- Update scan creation and scheduling services to handle multiple engines
- Add pg_trgm GIN indexes to asset and snapshot models for fuzzy search on url, title, and name fields
- Update scan views and serializers to support multi-engine selection and display
- Enhance frontend components for multi-engine scan initiation and scheduling
- Update test data generation script for multi-engine scan scenarios
- Add internationalization strings for multi-engine UI elements
- Refactor scan flow to use merged configuration instead of single engine config
- Update Docker compose files with latest configuration
2026-01-01 22:35:05 +08:00
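The commit does not state how the engine configurations are combined. One plausible approach is a recursive dictionary merge in which later engines override earlier ones. The sketch below assumes that rule; `merge_configs` is a hypothetical name and not the project's `config_merger` API.

```python
from copy import deepcopy
from typing import Any, Dict, Iterable


def _merge_into(base: Dict[str, Any], extra: Dict[str, Any]) -> None:
    """Merge extra into base in place; nested dicts are merged recursively."""
    for key, value in extra.items():
        if isinstance(value, dict) and isinstance(base.get(key), dict):
            _merge_into(base[key], value)
        else:
            # Scalars, lists, and new keys: the later config wins (assumption).
            base[key] = deepcopy(value)


def merge_configs(configs: Iterable[Dict[str, Any]]) -> Dict[str, Any]:
    """Merge engine configs left to right without mutating the inputs."""
    merged: Dict[str, Any] = {}
    for config in configs:
        _merge_into(merged, config)
    return merged


engine_a = {"subdomain_discovery": {"timeout": 10, "tools": ["subfinder"]}}
engine_b = {"subdomain_discovery": {"timeout": 30}, "port_scan": {"tool": "naabu"}}
print(merge_configs([engine_a, engine_b]))
```

Copying values on assignment keeps the merged result independent of the source engine configurations, so editing one does not silently change the other.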
yyhuni
9eda2caceb feat(asset): add response headers and body tracking with pg_trgm indexing
- Rename body_preview to response_body across endpoint and website models for consistency
- Change response_headers from Dict to string type for efficient text indexing
- Add pg_trgm PostgreSQL extension initialization in AssetConfig for GIN index support
- Update all DTOs to reflect response_body and response_headers field changes
- Modify repositories to handle new response_body and response_headers formats
- Update serializers and views to work with string-based response headers
- Add response_headers and response_body columns to frontend endpoint and website tables
- Update command templates and scan tasks to populate response_body and response_headers
- Add database initialization script for pg_trgm extension in PostgreSQL setup
- Update frontend types and translations for new field names
- Enable efficient fuzzy text search (pg_trgm trigram matching) on response headers and body content through GIN indexes
2026-01-01 19:34:11 +08:00
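To give a feel for what a pg_trgm index stores, the sketch below extracts trigrams the way the PostgreSQL documentation describes for a simple word: lowercase the text, then pad each word with two leading spaces and one trailing space. It is a simplified, ASCII-only illustration; the `trigrams` function is hypothetical and does not reproduce the extension's locale handling.

```python
import re
from typing import Set


def trigrams(text: str) -> Set[str]:
    """Return the set of trigrams for each ASCII alphanumeric word in text."""
    result: Set[str] = set()
    for word in re.findall(r"[a-z0-9]+", text.lower()):
        # Two spaces are prefixed and one space is suffixed to every word.
        padded = f"  {word} "
        result.update(padded[i:i + 3] for i in range(len(padded) - 2))
    return result


print(sorted(trigrams("cat")))
# The header value shares every trigram of the search term "nginx".
print(trigrams("nginx") <= trigrams("Server: nginx/1.18.0"))
```

A GIN index over these trigrams lets PostgreSQL narrow down candidate rows for a `LIKE '%keyword%'` query before rechecking the actual text.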
yyhuni
b1c9e202dd feat(sidebar): add feedback link to secondary navigation menu
- Import IconMessageReport icon from tabler/icons-react for feedback menu item
- Add feedback navigation item linking to GitHub issues page
- Add "feedback" translation key to English messages (en.json)
- Add "feedback" translation key to Chinese messages (zh.json) as "反馈建议"
- Improves user engagement by providing direct access to issue reporting
2026-01-01 18:31:34 +08:00
yyhuni
918669bc29 style(ui): update expandable cell whitespace handling for better formatting
- Change whitespace class from `whitespace-normal` to `whitespace-pre-wrap` in expandable cell component
- Improves text rendering by preserving whitespace and line breaks in cell content
- Ensures consistent formatting display across different content types (mono, url, muted variants)
2026-01-01 16:41:47 +08:00
yyhuni
fd70b0544d docs(frontend): update Chinese translations to English for consistency
- Change "响应头" to "Response Headers" in endpoint messages
- Change "响应头" to "Response Headers" in website messages
- Maintain consistency across frontend message translations
- Improve clarity for international users by standardizing field labels
2026-01-01 16:23:03 +08:00
github-actions[bot]
0f2df7a5f3 chore: bump version to v1.2.14-dev 2026-01-01 05:13:25 +00:00
yyhuni
857ab737b5 feat(fingerprint): enhance xingfinger task with snapshot tracking and field merging
- Replace `not_found_count` with `created_count` and `snapshot_count` metrics in fingerprint detect flow
- Initialize and aggregate `snapshot_count` across tool statistics
- Refactor `parse_xingfinger_line()` to return structured dict with url, techs, server, title, status_code, and content_length
- Replace `bulk_merge_tech_field()` with `bulk_merge_website_fields()` to support merging multiple WebSite fields
- Implement smart merge strategy: arrays deduplicated, scalar fields only updated when empty/NULL
- Remove dynamic model loading via importlib in favor of direct WebSite model import
- Add WebsiteSnapshotDTO and DjangoWebsiteSnapshotRepository imports for snapshot handling
- Improve xingfinger output parsing to capture server, title, and HTTP metadata alongside technology detection
2026-01-01 12:40:49 +08:00
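The merge strategy named in this commit (deduplicate arrays, fill scalars only when they are empty or NULL) can be sketched on plain dictionaries. This is an illustration under assumed field names, not the body of `bulk_merge_website_fields()`; `merge_website_fields` is a hypothetical helper.

```python
from typing import Any, Dict


def merge_website_fields(existing: Dict[str, Any], incoming: Dict[str, Any]) -> Dict[str, Any]:
    """Return a merged copy of existing, applying the two rules above."""
    merged = dict(existing)
    for field, new_value in incoming.items():
        old_value = merged.get(field)
        if isinstance(new_value, list):
            # Arrays: concatenate, then deduplicate while keeping first-seen order.
            combined = list(old_value or []) + new_value
            merged[field] = list(dict.fromkeys(combined))
        elif old_value in (None, "") and new_value not in (None, ""):
            # Scalars: only fill the field when the stored value is empty or NULL.
            merged[field] = new_value
    return merged


stored = {"tech": ["nginx"], "title": "", "webserver": "nginx", "status_code": None}
detected = {"tech": ["PHP", "nginx"], "title": "Home", "webserver": "Apache", "status_code": 200}
print(merge_website_fields(stored, detected))
```

Under these rules a later fingerprint pass can add technologies and fill gaps, but it never overwrites a value that an earlier stage already recorded.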
yyhuni
ee2d99edda feat(asset): add response headers tracking to endpoints and websites
- Add response_headers field to Endpoint and WebSite models as JSONField
- Add response_headers field to EndpointSnapshot and WebsiteSnapshot models
- Update all related DTOs to include response_headers with Dict[str, Any] type
- Add GIN indexes on response_headers fields for optimized JSON queries
- Update endpoint and website repositories to handle response_headers data
- Update serializers to include response_headers in API responses
- Update frontend components to display response headers in detail views
- Add response_headers to fingerprint detection and site scan tasks
- Update command templates and engine config to support header extraction
- Add internationalization strings for response headers in en.json and zh.json
- Update TypeScript types for endpoint and website to include response_headers
- Enhance scan history and target detail pages to show response header information
2026-01-01 12:25:22 +08:00
github-actions[bot]
db6ce16aca chore: bump version to v1.2.13-dev 2026-01-01 02:24:08 +00:00
yyhuni
ab800eca06 feat(frontend): reorder navigation tabs for improved UX
- Move "Websites" tab to first position in scan history and target layouts
- Reposition "IP Addresses" tab before "Ports" for better logical flow
- Maintain consistent tab ordering across both scan history and target pages
- Improve navigation hierarchy by placing primary discovery results first
2026-01-01 09:47:30 +08:00
yyhuni
e8e5572339 perf(asset): add GIN indexes for tech array fields and improve query parser
- Add GinIndex for tech array field in Endpoint model to optimize __contains queries
- Add GinIndex for tech array field in WebSite model to optimize __contains queries
- Import GinIndex from django.contrib.postgres.indexes
- Refactor QueryParser to protect quoted filter values during tokenization
- Implement placeholder-based filter extraction to preserve spaces within quoted values
- Replace filter tokens with placeholders before logical operator normalization
- Restore original filter conditions from placeholders during parsing
- Fix spacing in comments for consistency (add space after "从")
- Improves query performance for technology stack filtering on large datasets
2026-01-01 08:58:03 +08:00
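The placeholder technique described in this commit can be sketched as follows. The filter grammar (`field`, one of `==`, `!=`, `=`, and a double-quoted value) and the logical operators `&&` and `||` are assumptions made for illustration; the real QueryParser syntax is not shown in this diff.

```python
import re
from typing import Dict, List, Tuple, Union

Filter = Tuple[str, str, str]

# Hypothetical filter syntax: field, operator, double-quoted value.
FILTER_RE = re.compile(r'(\w+)\s*(==|!=|=)\s*"([^"]*)"')


def protect_filters(query: str) -> Tuple[str, Dict[str, Filter]]:
    """Swap each filter for a placeholder so later steps never see its spaces."""
    filters: Dict[str, Filter] = {}

    def stash(match: re.Match) -> str:
        key = f"__FILTER_{len(filters)}__"
        filters[key] = (match.group(1), match.group(2), match.group(3))
        return key

    return FILTER_RE.sub(stash, query), filters


def tokenize(query: str) -> List[Union[str, Filter]]:
    """Split a query into filters and logical operators."""
    protected, filters = protect_filters(query)
    # Normalize spacing around the (assumed) logical operators && and ||.
    normalized = re.sub(r"\s*(&&|\|\|)\s*", r" \1 ", protected)
    # Restore the original filter conditions from their placeholders.
    return [filters.get(token, token) for token in normalized.split()]


print(tokenize('title="admin panel" && tech=="nginx"'))
```

Because the quoted value is replaced before any whitespace splitting happens, a value such as `admin panel` survives as a single filter instead of being cut into two tokens.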
github-actions[bot]
d48d4bbcad chore: bump version to v1.2.12-dev 2025-12-31 16:01:48 +00:00
yyhuni
d1cca4c083 Set base timeout to 10s 2025-12-31 23:27:02 +08:00
yyhuni
df0810c863 feat: add fingerprint recognition feature and update documentation
- Add fingerprint recognition section to README with support for 27,000+ rules from multiple sources (EHole, Goby, Wappalyzer, Fingers, FingerPrintHub, ARL)
- Update scanning pipeline architecture diagram to include fingerprint recognition stage between site identification and deep analysis
- Add fingerprint recognition styling to mermaid diagram for visual consistency
- Include WORKER_API_KEY environment variable in task distributor for worker authentication
- Update WeChat QR code image and public account name from "洋洋的小黑屋" to "塔罗安全学苑"
- Fix import statements in nav-system.tsx to use i18n navigation utilities instead of next/link and next/navigation
- Enhance scanning workflow documentation to reflect complete pipeline: subdomain discovery → port scanning → site identification → fingerprint recognition → URL collection → directory scanning → vulnerability scanning
2025-12-31 23:09:25 +08:00
yyhuni
d33e54c440 docs: simplify quick-start guide
- Remove alternative ZIP download method, keep only Git clone approach
- Remove update.sh script reference from service management section
- Remove dedicated "定期更新" (periodic updates) section
- Streamline documentation to focus on primary installation and usage paths
2025-12-31 22:50:08 +08:00
yyhuni
35a306fe8b fix: dev environment 2025-12-31 22:46:42 +08:00
yyhuni
724df82931 chore: pin Docker base image digests and add worker API key generation
- Pin golang:1.24 base image to specific digest to prevent upstream cache invalidation
- Pin ubuntu:24.04 base image to specific digest to prevent upstream cache invalidation
- Add WORKER_API_KEY generation in install.sh auto_fill_docker_env_secrets function
- Generate random 32-character string for WORKER_API_KEY during installation
- Update installation info message to include WORKER_API_KEY in generated secrets list
- Improve build reproducibility and security by using immutable image references
2025-12-31 22:40:38 +08:00
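The key generation itself lives in `install.sh`, which is not shown in this diff. As an equivalent illustration in Python, the sketch below produces a random 32-character alphanumeric string using the standard `secrets` module. The alphabet and the `generate_api_key` name are assumptions.

```python
import secrets
import string


def generate_api_key(length: int = 32) -> str:
    """Return a random alphanumeric key from a cryptographically secure source."""
    alphabet = string.ascii_letters + string.digits
    return "".join(secrets.choice(alphabet) for _ in range(length))


# Example: a value suitable for a WORKER_API_KEY entry in a .env file.
print(f"WORKER_API_KEY={generate_api_key()}")
```

Using `secrets` rather than `random` matters here because the key authenticates workers and should not be predictable.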
yyhuni
8dfffdf802 fix: authentication 2025-12-31 22:21:40 +08:00
github-actions[bot]
b8cb85ce0b chore: bump version to v1.2.9-dev 2025-12-31 13:48:44 +00:00
yyhuni
da96d437a4 Add authorization and authentication 2025-12-31 20:18:34 +08:00
github-actions[bot]
feaf8062e5 chore: bump version to v1.2.8-dev 2025-12-31 11:33:14 +00:00
yyhuni
4bab76f233 fix: organization deletion issue 2025-12-31 17:50:37 +08:00
yyhuni
09416b4615 fix: redis port 2025-12-31 17:45:25 +08:00
github-actions[bot]
bc1c5f6b0e chore: bump version to v1.2.7-dev 2025-12-31 06:16:42 +00:00
github-actions[bot]
2f2742e6fe chore: bump version to v1.2.6-dev 2025-12-31 05:29:36 +00:00
yyhuni
be3c346a74 Add search fields 2025-12-31 12:40:21 +08:00
yyhuni
0c7a6fff12 Add search on the tech field 2025-12-31 12:37:02 +08:00
yyhuni
3b4f0e3147 fix: fingerprint recognition 2025-12-31 12:30:31 +08:00
yyhuni
51212a2a0c fix: fingerprint recognition 2025-12-31 12:17:23 +08:00
yyhuni
58533bbaf6 fix:docker api 2025-12-31 12:03:08 +08:00
github-actions[bot]
6ccca1602d chore: bump version to v1.2.5-dev 2025-12-31 03:48:32 +00:00
yyhuni
6389b0f672 feat(fingerprints): Add type annotation to getAcceptConfig function
- Add explicit return type annotation `Record<string, string[]>` to getAcceptConfig function
- Improve type safety and IDE autocomplete for file type configuration
- Enhance code clarity for accepted file types mapping in import dialog
2025-12-31 10:17:25 +08:00
yyhuni
d7599b8599 feat(fingerprints): Add database indexes and expand test data generation
- Add database indexes on 'link' field in FingersFingerprint model for improved query performance
- Add database index on 'author' field in FingerPrintHubFingerprint model for filtering optimization
- Expand test data generation to include Fingers, FingerPrintHub, and ARL fingerprint types
- Add comprehensive fingerprint data generation methods with realistic templates and patterns
- Update test data cleanup to include all fingerprint table types
- Add i18n translations for fingerprint-related UI components and labels
- Optimize route prefetching hook for better performance
- Improve fingerprint data table columns and vulnerability columns display consistency
2025-12-31 10:04:15 +08:00
yyhuni
8eff298293 Update mirror acceleration logic 2025-12-31 08:56:55 +08:00
yyhuni
3634101c5b Add ARL (灯塔) and other fingerprints 2025-12-31 08:55:37 +08:00
yyhuni
163973a7df feat(i18n): Add internationalization support to dropzone component
- Add useTranslations hook to DropzoneContent component for multi-language support
- Add useTranslations hook to DropzoneEmptyState component for multi-language support
- Replace hardcoded English strings with i18n translation keys in dropzone UI
- Add comprehensive translation keys for dropzone messages in en.json:
* uploadFile, uploadFiles, dragOrClick, dragOrClickReplace
* moreFiles, supports, minimum, maximum, sizeBetween
- Add corresponding Chinese translations in zh.json for all dropzone messages
- Support dynamic content in translations using parameterized keys (files count, size ranges)
- Ensure consistent user experience across English and Chinese interfaces
2025-12-30 21:19:37 +08:00
yyhuni
80ffecba3e feat(i18n): Add UI component i18n provider and standardize translation keys
- Add UiI18nProvider component to wrap UI library translations globally
- Integrate UiI18nProvider into root layout for consistent i18n support
- Standardize download action translation keys (allEndpoints → all, selectedEndpoints → selected)
- Update ExpandableTagList component prop from maxVisible to maxLines for better layout control
- Fix color scheme in dashboard stop scan button (chart-2 → primary)
- Add DOCKER_API_VERSION configuration to backend settings for Docker client compatibility
- Update task distributor to use configurable Docker API version (default 1.40)
- Add environment variable support for Docker API version in task execution commands
- Update i18n configuration and message files with standardized keys
- Ensure UI components respect application locale settings across all data tables and dialogs
2025-12-30 21:19:28 +08:00
yyhuni
3c21ac940c Restore SSH docker 2025-12-30 20:35:51 +08:00
yyhuni
5c9f484d70 fix(frontend): Fix i18n translation key references and add missing labels
- Change "nav" translation namespace to "navigation" in scan engine and wordlists pages
- Replace parameterized translation calls with raw translation strings for cron schedule options in scheduled scan page and dashboard component
- Cast raw translation results to string type for proper TypeScript typing
- Add missing "name" and "type" labels to fingerprint section in English and Chinese message files
- Ensure consistent translation key usage across components for better maintainability
2025-12-30 18:21:16 +08:00
yyhuni
7567f6c25b Update text descriptions 2025-12-30 18:08:39 +08:00
yyhuni
0599a0b298 Add ansi-to-html 2025-12-30 18:01:29 +08:00
yyhuni
f7557fe90c Use ansi-to-html for log display 2025-12-30 18:01:22 +08:00
yyhuni
13571b9772 fix(frontend): Fix xterm SSR initialization error
- Add 100ms delay for terminal initialization to ensure DOM is mounted
- Use requestAnimationFrame for fit() to avoid dimensions error
- Add try-catch for all xterm operations
- Proper cleanup on unmount

Fixes: Cannot read properties of undefined (reading 'dimensions')
2025-12-30 17:41:38 +08:00
yyhuni
8ee76eef69 feat(frontend): Add ANSI color support for system logs
- Create AnsiLogViewer component using xterm.js
- Replace Monaco Editor with xterm for log viewing
- Native ANSI escape code rendering (colors, bold, etc.)
- Auto-scroll to bottom, clickable URLs support

Benefits:
- Colorized logs for better readability
- No more escape codes like [32m[0m in UI
- Professional terminal-like experience
2025-12-30 17:39:12 +08:00
yyhuni
2a31e29aa2 fix: Add shell quoting for command arguments
- Use shlex.quote() to escape special characters in argument values
- Fixes: 'unrecognized arguments' error when values contain spaces
- Example: target_name='example.com scan' now correctly quoted
2025-12-30 17:32:09 +08:00
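The fix can be illustrated with the standard library's `shlex.quote()`. In the sketch below, `build_command` and the `--name=value` argument layout are assumptions made for illustration; only the quoting call is taken from the commit message.

```python
import shlex
from typing import Dict


def build_command(script: str, args: Dict[str, str]) -> str:
    """Build a shell command line with every argument value safely quoted."""
    parts = [shlex.quote(script)]
    for name, value in args.items():
        # Quote only the value so spaces and shell metacharacters stay inside it.
        parts.append(f"--{name}={shlex.quote(str(value))}")
    return " ".join(parts)


command = build_command("run_scan.py", {"target_name": "example.com scan"})
print(command)  # run_scan.py --target_name='example.com scan'
# Round trip: the shell-style split yields one argument, not two.
print(shlex.split(command))
```

Without the quoting, the shell would split `example.com scan` into two words and the argument parser would report `scan` as an unrecognized argument.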
yyhuni
81abc59961 Refactor: Migrate TaskDistributor to Docker SDK
- Replace CLI subprocess with Python Docker SDK
- Add DockerClientManager for unified container management
- Remove 300+ lines of shell command building code
- Enable future features: container status monitoring, log streaming

Breaking changes: None (backward compatible with existing scans)
Rollback: git reset --hard v1.0-before-docker-sdk
2025-12-30 17:23:18 +08:00
yyhuni
ffbfec6dd5 feat(stage2): Refactor TaskDistributor to use Docker SDK
- Replace CLI subprocess calls with DockerClientManager.run_container()
- Add helper methods: _build_container_command, _build_container_environment, _build_container_volumes
- Refactor execute_scan_flow() and execute_cleanup_on_all_workers() to use SDK
- Remove old CLI methods: _build_docker_command, _execute_docker_command, _execute_local_docker, _execute_ssh_docker
- Remove paramiko import (no longer needed for local workers)

Benefits:
- 300+ lines removed (CLI string building complexity)
- Type-safe container configuration (no more shlex.quote errors)
- Structured error handling (ImageNotFound, APIError)
- Ready for container status monitoring and log streaming
2025-12-30 17:20:26 +08:00
yyhuni
a0091636a8 feat(stage1): Add DockerClientManager
- Create docker_client_manager.py with local Docker client support
- Add container lifecycle management (run, status, logs, stop, remove)
- Implement structured error handling (ImageNotFound, APIError)
- Add client connection caching and reuse
- Set Docker API version to 1.40 (compatible with Docker 19.03+)
- Add dependencies: docker>=6.0.0, packaging>=21.0

TODO: Remote worker support (Docker Context or SSH tunnel)
2025-12-30 17:17:17 +08:00
yyhuni
69490ab396 feat: Add DockerClientManager for unified Docker client management
- Create docker_client_manager.py with local Docker client support
- Add container lifecycle management (run, status, logs, stop, remove)
- Implement structured error handling (ImageNotFound, APIError)
- Add client connection caching and reuse
- Set Docker API version to 1.40 (compatible with Docker 19.03+)
- Add docker>=6.0.0 and packaging>=21.0 dependencies

TODO: Remote worker support (Docker Context or SSH tunnel)
2025-12-30 17:15:29 +08:00
yyhuni
7306964abf Update README 2025-12-30 16:44:08 +08:00
yyhuni
cb6b0259e3 fix: response mismatch 2025-12-30 16:40:17 +08:00
yyhuni
e1b4618e58 refactor(worker): isolate scan tools to dedicated directory
- Move scan tools base path from `/usr/local/bin` to `/opt/xingrin-tools/bin` to avoid conflicts with system tools and Python packages
- Create dedicated `/opt/xingrin-tools/bin` directory in worker Dockerfile following FHS standards
- Update PATH environment variable to prioritize project-specific tools directory
- Add `SCAN_TOOLS_PATH` environment variable to `.env.example` with documentation
- Update settings.py to use new default path with explanatory comments
- Fix TypeScript type annotation in system-logs-view.tsx for better maintainability
- Remove frontend package-lock.json to reduce repository size
- Update task distributor comment to reflect new tool location
This change improves tool isolation and prevents naming conflicts while maintaining FHS compliance.
2025-12-30 11:42:09 +08:00
yyhuni
556dcf5f62 Refactor log UI feature 2025-12-30 11:13:38 +08:00
yyhuni
0628eef025 Refactor responses into a standard response format 2025-12-30 10:56:26 +08:00
yyhuni
38ed8bc642 fix(scan): improve config parser validation and enable subdomain resolve timeout
- Uncomment timeout: auto setting in subdomain discovery config example
- Add validation to reject None or non-dict configuration values
- Raise ValueError with descriptive message when config is None
- Raise ValueError when config is not a dictionary type
- Update docstring to document Raises section for error conditions
- Prevent silent failures from malformed YAML configurations
2025-12-30 08:54:02 +08:00
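The validation described in this commit amounts to two guard clauses before the parsed YAML is used. The sketch below assumes the YAML has already been loaded into a Python object; `validate_config` and the error wording are hypothetical.

```python
from typing import Any, Dict


def validate_config(config: Any) -> Dict[str, Any]:
    """Return config unchanged if it is a dict.

    Raises:
        ValueError: if config is None (for example, an empty YAML document)
            or is not a dictionary (for example, a bare list or string).
    """
    if config is None:
        raise ValueError("configuration is empty: expected a mapping, got None")
    if not isinstance(config, dict):
        raise ValueError(
            f"configuration must be a mapping, got {type(config).__name__}"
        )
    return config


try:
    validate_config(["subfinder", "amass"])
except ValueError as exc:
    print(exc)  # configuration must be a mapping, got list
```

Failing early with a descriptive message avoids the harder-to-diagnose attribute errors that would otherwise surface deep inside the scan flow.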
yyhuni
2f4d6a2168 Unify tool mount path to /usr/local/bin 2025-12-30 08:45:36 +08:00
yyhuni
c25cb9e06b fix: tool mounting 2025-12-30 08:39:17 +08:00
yyhuni
b14ab71c7f fix:auth frontend 2025-12-30 08:12:04 +08:00
github-actions[bot]
8b5060e2d3 chore: bump version to v1.2.2-dev 2025-12-29 17:08:05 +00:00
214 changed files with 18612 additions and 11717 deletions

View File

@@ -62,9 +62,14 @@
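- **Custom workflows** - Configure scan workflows in YAML for flexible orchestration
- **Scheduled scans** - Configure with Cron expressions for automated periodic scanning
### 🔖 Fingerprint Recognition
- **Multi-source fingerprint library** - Built-in 27,000+ fingerprint rules from EHole, Goby, Wappalyzer, Fingers, FingerPrintHub, ARL, and others
- **Automatic recognition** - Runs automatically during the scan workflow to identify web application technology stacks
- **Fingerprint management** - Supports querying, importing, and exporting fingerprint rules
#### Scan Workflow Architecture
The complete scan workflow includes stages such as subdomain discovery, port scanning, site discovery, URL collection, directory scanning, and vulnerability scanning
The complete scan workflow includes: subdomain discovery, port scanning, site discovery, fingerprint recognition, URL collection, directory scanning, vulnerability scanning, and other stages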
```mermaid
flowchart LR
@@ -75,7 +80,8 @@ flowchart LR
SUB["Subdomain Discovery<br/>subfinder, amass, puredns"]
PORT["Port Scanning<br/>naabu"]
SITE["Site Identification<br/>httpx"]
SUB --> PORT --> SITE
FINGER["Fingerprint Recognition<br/>xingfinger"]
SUB --> PORT --> SITE --> FINGER
end
subgraph STAGE2["Stage 2: Deep Analysis"]
@@ -91,7 +97,7 @@ flowchart LR
FINISH["Scan Complete"]
START --> STAGE1
SITE --> STAGE2
FINGER --> STAGE2
STAGE2 --> STAGE3
STAGE3 --> FINISH
@@ -103,6 +109,7 @@ flowchart LR
style SUB fill:#5dade2,stroke:#3498db,stroke-width:1px,color:#fff
style PORT fill:#5dade2,stroke:#3498db,stroke-width:1px,color:#fff
style SITE fill:#5dade2,stroke:#3498db,stroke-width:1px,color:#fff
style FINGER fill:#5dade2,stroke:#3498db,stroke-width:1px,color:#fff
style URL fill:#bb8fce,stroke:#9b59b6,stroke-width:1px,color:#fff
style DIR fill:#bb8fce,stroke:#9b59b6,stroke-width:1px,color:#fff
style VULN fill:#f0b27a,stroke:#e67e22,stroke-width:1px,color:#fff
@@ -178,7 +185,7 @@ cd xingrin
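# Install and start (production mode)
sudo ./install.sh
# 🇨🇳 Users in mainland China are advised to use mirror acceleration
# 🇨🇳 Users in mainland China are advised to use mirror acceleration (third-party acceleration services may stop working; long-term availability is not guaranteed)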
sudo ./install.sh --mirror
```
@@ -211,13 +218,12 @@ sudo ./uninstall.sh
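- 🐛 **Found a bug?** Click the link on the right to submit an [Issue](https://github.com/yyhuni/xingrin/issues)
- 💡 **Have new ideas, such as UI or feature design?** You are welcome to submit suggestions via the link on the right [Issue](https://github.com/yyhuni/xingrin/issues)
- 🔧 **Want to contribute?** Follow my WeChat official account and contact me directly
## 📧 Contact
- The current version is only used by me personally, so there may be many edge-case issues
- For questions, suggestions, or anything else, please submit an [Issue](https://github.com/yyhuni/xingrin/issues) first; you can also message my WeChat official account directly, and I will reply to every message
- WeChat official account: **洋洋的小黑屋**
- WeChat official account: **塔罗安全学苑**
<img src="docs/wechat-qrcode.png" alt="WeChat official account" width="200">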

View File

@@ -1 +1 @@
v1.1.14
v1.2.14-dev

View File

@@ -1,5 +1,9 @@
import logging
from django.apps import AppConfig
logger = logging.getLogger(__name__)
class AssetConfig(AppConfig):
default_auto_field = 'django.db.models.BigAutoField'
@@ -8,3 +12,34 @@ class AssetConfig(AppConfig):
def ready(self):
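# Import all models so that Django discovers and registers them
from . import models
# Enable the pg_trgm extension (used for fuzzy text search indexes)
# Covers upgrades of existing databases
self._ensure_pg_trgm_extension()
def _ensure_pg_trgm_extension(self):
"""
Ensure the pg_trgm extension is enabled.
The extension backs the GIN indexes on the response_body and response_headers fields,
enabling efficient fuzzy text search.
"""
from django.db import connection
# Check whether the database is PostgreSQL
if connection.vendor != 'postgresql':
logger.debug("Skipping pg_trgm extension: the current database is not PostgreSQL")
return
try:
with connection.cursor() as cursor:
cursor.execute("CREATE EXTENSION IF NOT EXISTS pg_trgm;")
logger.debug("pg_trgm extension enabled")
except Exception as e:
# Log the error without blocking application startup
# Common cause: insufficient privileges (superuser privileges required)
logger.warning(
"Could not create the pg_trgm extension: %s. "
"GIN indexes on the response_body and response_headers fields may not work as a result. "
"Run manually: CREATE EXTENSION IF NOT EXISTS pg_trgm;",
str(e)
)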

View File

@@ -14,12 +14,13 @@ class EndpointDTO:
status_code: Optional[int] = None
content_length: Optional[int] = None
webserver: Optional[str] = None
body_preview: Optional[str] = None
response_body: Optional[str] = None
content_type: Optional[str] = None
tech: Optional[List[str]] = None
vhost: Optional[bool] = None
location: Optional[str] = None
matched_gf_patterns: Optional[List[str]] = None
response_headers: Optional[str] = None
def __post_init__(self):
if self.tech is None:

View File

@@ -17,9 +17,10 @@ class WebSiteDTO:
webserver: str = ''
content_type: str = ''
tech: List[str] = None
body_preview: str = ''
response_body: str = ''
vhost: Optional[bool] = None
created_at: str = None
response_headers: str = ''
def __post_init__(self):
if self.tech is None:

View File

@@ -22,10 +22,11 @@ class EndpointSnapshotDTO:
webserver: str = ''
content_type: str = ''
tech: List[str] = None
body_preview: str = ''
response_body: str = ''
vhost: Optional[bool] = None
matched_gf_patterns: List[str] = None
target_id: Optional[int] = None # redundant field, used to sync to the asset table
response_headers: str = ''
def __post_init__(self):
if self.tech is None:
@@ -53,10 +54,11 @@ class EndpointSnapshotDTO:
status_code=self.status_code,
content_length=self.content_length,
webserver=self.webserver,
body_preview=self.body_preview,
response_body=self.response_body,
content_type=self.content_type,
tech=self.tech if self.tech else [],
vhost=self.vhost,
location=self.location,
matched_gf_patterns=self.matched_gf_patterns if self.matched_gf_patterns else []
matched_gf_patterns=self.matched_gf_patterns if self.matched_gf_patterns else [],
response_headers=self.response_headers,
)

View File

@@ -23,8 +23,9 @@ class WebsiteSnapshotDTO:
web_server: str = ''
content_type: str = ''
tech: List[str] = None
body_preview: str = ''
response_body: str = ''
vhost: Optional[bool] = None
response_headers: str = ''
def __post_init__(self):
if self.tech is None:
@@ -50,6 +51,7 @@ class WebsiteSnapshotDTO:
webserver=self.web_server,
content_type=self.content_type,
tech=self.tech if self.tech else [],
body_preview=self.body_preview,
vhost=self.vhost
response_body=self.response_body,
vhost=self.vhost,
response_headers=self.response_headers,
)

View File

@@ -1,6 +1,7 @@
from django.db import models
from django.contrib.postgres.fields import ArrayField
from django.contrib.postgres.indexes import GinIndex
from django.core.validators import MinValueValidator, MaxValueValidator
@@ -34,6 +35,12 @@ class Subdomain(models.Model):
models.Index(fields=['name', 'target']), # composite index, speeds up get_by_names_and_target_id bulk queries
models.Index(fields=['target']), # speeds up looking up subdomains by target_id
models.Index(fields=['name']), # speeds up looking up subdomains by name (search scenarios)
# pg_trgm GIN index, supports LIKE '%keyword%' fuzzy search
GinIndex(
name='subdomain_name_trgm_idx',
fields=['name'],
opclasses=['gin_trgm_ops']
),
]
constraints = [
# Plain unique constraint: the name + target combination is unique
@@ -84,11 +91,10 @@ class Endpoint(models.Model):
default='',
help_text='Server type (value of the HTTP Server response header)'
)
body_preview = models.CharField(
max_length=1000,
response_body = models.TextField(
blank=True,
default='',
help_text='First N characters of the response body (default 100 characters)'
help_text='HTTP response body'
)
content_type = models.CharField(
max_length=200,
@@ -123,6 +129,11 @@ class Endpoint(models.Model):
default=list,
help_text='List of matched GF patterns, used to identify sensitive endpoints (e.g. api, debug, config)'
)
response_headers = models.TextField(
blank=True,
default='',
help_text='Raw HTTP response headers'
)
class Meta:
db_table = 'endpoint'
@@ -131,11 +142,28 @@ class Endpoint(models.Model):
ordering = ['-created_at']
indexes = [
models.Index(fields=['-created_at']),
models.Index(fields=['target']), # 优化从target_id快速查找下面的端点主关联字段
models.Index(fields=['target']), # 优化从 target_id快速查找下面的端点主关联字段
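models.Index(fields=['url']), # URL index, improves query performance
models.Index(fields=['host']), # host index, speeds up queries by hostname
models.Index(fields=['status_code']), # status code index, speeds up filtering
models.Index(fields=['title']), # title index, speeds up smart filter search
GinIndex(fields=['tech']), # GIN index, speeds up __contains queries on the tech array field
# pg_trgm GIN index, supports LIKE '%keyword%' fuzzy search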
GinIndex(
name='endpoint_resp_headers_trgm_idx',
fields=['response_headers'],
opclasses=['gin_trgm_ops']
),
GinIndex(
name='endpoint_url_trgm_idx',
fields=['url'],
opclasses=['gin_trgm_ops']
),
GinIndex(
name='endpoint_title_trgm_idx',
fields=['title'],
opclasses=['gin_trgm_ops']
),
]
constraints = [
# Plain unique constraint: the url + target combination is unique
@@ -186,11 +214,10 @@ class WebSite(models.Model):
default='',
help_text='Server type (value of the HTTP Server response header)'
)
body_preview = models.CharField(
max_length=1000,
response_body = models.TextField(
blank=True,
default='',
help_text='First N characters of the response body (default 100 characters)'
help_text='HTTP response body'
)
content_type = models.CharField(
max_length=200,
@@ -219,6 +246,11 @@ class WebSite(models.Model):
blank=True,
help_text='Whether virtual hosts are supported'
)
response_headers = models.TextField(
blank=True,
default='',
help_text='Raw HTTP response headers'
)
class Meta:
db_table = 'website'
@@ -229,9 +261,26 @@ class WebSite(models.Model):
models.Index(fields=['-created_at']),
models.Index(fields=['url']), # URL index, improves query performance
models.Index(fields=['host']), # host index, speeds up queries by hostname
models.Index(fields=['target']), # 优化从target_id快速查找下面的站点
models.Index(fields=['target']), # 优化从 target_id快速查找下面的站点
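models.Index(fields=['title']), # title index, speeds up smart filter search
models.Index(fields=['status_code']), # status code index, speeds up smart filter search
GinIndex(fields=['tech']), # GIN index, speeds up __contains queries on the tech array field
# pg_trgm GIN index, supports LIKE '%keyword%' fuzzy search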
GinIndex(
name='website_resp_headers_trgm_idx',
fields=['response_headers'],
opclasses=['gin_trgm_ops']
),
GinIndex(
name='website_url_trgm_idx',
fields=['url'],
opclasses=['gin_trgm_ops']
),
GinIndex(
name='website_title_trgm_idx',
fields=['title'],
opclasses=['gin_trgm_ops']
),
]
constraints = [
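# Plain unique constraint: the url + target combination is unique
@@ -308,6 +357,12 @@ class Directory(models.Model):
models.Index(fields=['target']), # speeds up looking up directories by target_id
models.Index(fields=['url']), # URL index, speeds up search and the unique constraint
models.Index(fields=['status']), # status code index, speeds up filtering
# pg_trgm GIN index, supports LIKE '%keyword%' fuzzy search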
GinIndex(
name='directory_url_trgm_idx',
fields=['url'],
opclasses=['gin_trgm_ops']
),
]
constraints = [
# Plain unique constraint: the target + url combination is unique

View File

@@ -1,5 +1,6 @@
from django.db import models
from django.contrib.postgres.fields import ArrayField
from django.contrib.postgres.indexes import GinIndex
from django.core.validators import MinValueValidator, MaxValueValidator
@@ -26,6 +27,12 @@ class SubdomainSnapshot(models.Model):
models.Index(fields=['scan']),
models.Index(fields=['name']),
models.Index(fields=['-created_at']),
# pg_trgm GIN index, supports LIKE '%keyword%' fuzzy search
GinIndex(
name='subdomain_snap_name_trgm',
fields=['name'],
opclasses=['gin_trgm_ops']
),
]
constraints = [
# Unique constraint: within one scan, a subdomain can be recorded only once
@@ -68,8 +75,13 @@ class WebsiteSnapshot(models.Model):
default=list,
help_text='Technology stack'
)
body_preview = models.TextField(blank=True, default='', help_text='Response body preview')
response_body = models.TextField(blank=True, default='', help_text='HTTP response body')
vhost = models.BooleanField(null=True, blank=True, help_text='Virtual host flag')
response_headers = models.TextField(
blank=True,
default='',
help_text='Raw HTTP response headers'
)
created_at = models.DateTimeField(auto_now_add=True, help_text='Creation time')
class Meta:
@@ -83,6 +95,23 @@ class WebsiteSnapshot(models.Model):
models.Index(fields=['host']), # host index, speeds up queries by hostname
models.Index(fields=['title']), # title index, speeds up title search
models.Index(fields=['-created_at']),
GinIndex(fields=['tech']), # GIN index, speeds up array field queries
# pg_trgm GIN index, supports LIKE '%keyword%' fuzzy search
GinIndex(
name='ws_snap_resp_hdr_trgm',
fields=['response_headers'],
opclasses=['gin_trgm_ops']
),
GinIndex(
name='ws_snap_url_trgm',
fields=['url'],
opclasses=['gin_trgm_ops']
),
GinIndex(
name='ws_snap_title_trgm',
fields=['title'],
opclasses=['gin_trgm_ops']
),
]
constraints = [
# Unique constraint: within one scan, a URL can be recorded only once
@@ -132,6 +161,12 @@ class DirectorySnapshot(models.Model):
models.Index(fields=['status']), # status code index, speeds up filtering
models.Index(fields=['content_type']), # content_type index, speeds up content type search
models.Index(fields=['-created_at']),
# pg_trgm GIN index, supports LIKE '%keyword%' fuzzy search
GinIndex(
name='dir_snap_url_trgm',
fields=['url'],
opclasses=['gin_trgm_ops']
),
]
constraints = [
# Unique constraint: within one scan, a directory URL can be recorded only once
@@ -251,7 +286,7 @@ class EndpointSnapshot(models.Model):
default=list,
help_text='Technology stack'
)
body_preview = models.CharField(max_length=1000, blank=True, default='', help_text='Response body preview')
response_body = models.TextField(blank=True, default='', help_text='HTTP response body')
vhost = models.BooleanField(null=True, blank=True, help_text='Virtual host flag')
matched_gf_patterns = ArrayField(
models.CharField(max_length=100),
@@ -259,6 +294,11 @@ class EndpointSnapshot(models.Model):
default=list,
help_text='List of matched GF patterns'
)
response_headers = models.TextField(
blank=True,
default='',
help_text='Raw HTTP response headers'
)
created_at = models.DateTimeField(auto_now_add=True, help_text='Creation time')
class Meta:
@@ -274,6 +314,23 @@ class EndpointSnapshot(models.Model):
models.Index(fields=['status_code']), # status code index, speeds up filtering
models.Index(fields=['webserver']), # webserver index, speeds up server search
models.Index(fields=['-created_at']),
GinIndex(fields=['tech']), # GIN index, speeds up array field queries
# pg_trgm GIN index, supports LIKE '%keyword%' fuzzy search
GinIndex(
name='ep_snap_resp_hdr_trgm',
fields=['response_headers'],
opclasses=['gin_trgm_ops']
),
GinIndex(
name='ep_snap_url_trgm',
fields=['url'],
opclasses=['gin_trgm_ops']
),
GinIndex(
name='ep_snap_title_trgm',
fields=['title'],
opclasses=['gin_trgm_ops']
),
]
constraints = [
# Unique constraint: within one scan, a URL can be recorded only once


@@ -48,12 +48,13 @@ class DjangoEndpointRepository:
status_code=item.status_code,
content_length=item.content_length,
webserver=item.webserver or '',
body_preview=item.body_preview or '',
response_body=item.response_body or '',
content_type=item.content_type or '',
tech=item.tech if item.tech else [],
vhost=item.vhost,
location=item.location or '',
matched_gf_patterns=item.matched_gf_patterns if item.matched_gf_patterns else []
matched_gf_patterns=item.matched_gf_patterns if item.matched_gf_patterns else [],
response_headers=item.response_headers if item.response_headers else ''
)
for item in unique_items
]
@@ -65,8 +66,8 @@ class DjangoEndpointRepository:
unique_fields=['url', 'target'],
update_fields=[
'host', 'title', 'status_code', 'content_length',
'webserver', 'body_preview', 'content_type', 'tech',
'vhost', 'location', 'matched_gf_patterns'
'webserver', 'response_body', 'content_type', 'tech',
'vhost', 'location', 'matched_gf_patterns', 'response_headers'
],
batch_size=1000
)
@@ -138,12 +139,13 @@ class DjangoEndpointRepository:
status_code=item.status_code,
content_length=item.content_length,
webserver=item.webserver or '',
body_preview=item.body_preview or '',
response_body=item.response_body or '',
content_type=item.content_type or '',
tech=item.tech if item.tech else [],
vhost=item.vhost,
location=item.location or '',
matched_gf_patterns=item.matched_gf_patterns if item.matched_gf_patterns else []
matched_gf_patterns=item.matched_gf_patterns if item.matched_gf_patterns else [],
response_headers=item.response_headers if item.response_headers else ''
)
for item in unique_items
]
@@ -183,7 +185,7 @@ class DjangoEndpointRepository:
.values(
'url', 'host', 'location', 'title', 'status_code',
'content_length', 'content_type', 'webserver', 'tech',
'body_preview', 'vhost', 'matched_gf_patterns', 'created_at'
'response_body', 'response_headers', 'vhost', 'matched_gf_patterns', 'created_at'
)
.order_by('url')
)


@@ -49,12 +49,13 @@ class DjangoWebSiteRepository:
location=item.location or '',
title=item.title or '',
webserver=item.webserver or '',
body_preview=item.body_preview or '',
response_body=item.response_body or '',
content_type=item.content_type or '',
tech=item.tech if item.tech else [],
status_code=item.status_code,
content_length=item.content_length,
vhost=item.vhost
vhost=item.vhost,
response_headers=item.response_headers if item.response_headers else ''
)
for item in unique_items
]
@@ -66,8 +67,8 @@ class DjangoWebSiteRepository:
unique_fields=['url', 'target'],
update_fields=[
'host', 'location', 'title', 'webserver',
'body_preview', 'content_type', 'tech',
'status_code', 'content_length', 'vhost'
'response_body', 'content_type', 'tech',
'status_code', 'content_length', 'vhost', 'response_headers'
],
batch_size=1000
)
@@ -132,12 +133,13 @@ class DjangoWebSiteRepository:
location=item.location or '',
title=item.title or '',
webserver=item.webserver or '',
body_preview=item.body_preview or '',
response_body=item.response_body or '',
content_type=item.content_type or '',
tech=item.tech if item.tech else [],
status_code=item.status_code,
content_length=item.content_length,
vhost=item.vhost
vhost=item.vhost,
response_headers=item.response_headers if item.response_headers else ''
)
for item in unique_items
]
@@ -177,7 +179,7 @@ class DjangoWebSiteRepository:
.values(
'url', 'host', 'location', 'title', 'status_code',
'content_length', 'content_type', 'webserver', 'tech',
'body_preview', 'vhost', 'created_at'
'response_body', 'response_headers', 'vhost', 'created_at'
)
.order_by('url')
)


@@ -44,6 +44,7 @@ class DjangoEndpointSnapshotRepository:
snapshots.append(EndpointSnapshot(
scan_id=item.scan_id,
url=item.url,
host=item.host if item.host else '',
title=item.title,
status_code=item.status_code,
content_length=item.content_length,
@@ -51,9 +52,10 @@ class DjangoEndpointSnapshotRepository:
webserver=item.webserver,
content_type=item.content_type,
tech=item.tech if item.tech else [],
body_preview=item.body_preview,
response_body=item.response_body,
vhost=item.vhost,
matched_gf_patterns=item.matched_gf_patterns if item.matched_gf_patterns else []
matched_gf_patterns=item.matched_gf_patterns if item.matched_gf_patterns else [],
response_headers=item.response_headers if item.response_headers else ''
))
# Bulk create (ignore conflicts; deduplicate via the unique constraint)
@@ -100,7 +102,7 @@ class DjangoEndpointSnapshotRepository:
.values(
'url', 'host', 'location', 'title', 'status_code',
'content_length', 'content_type', 'webserver', 'tech',
'body_preview', 'vhost', 'matched_gf_patterns', 'created_at'
'response_body', 'response_headers', 'vhost', 'matched_gf_patterns', 'created_at'
)
.order_by('url')
)


@@ -52,8 +52,9 @@ class DjangoWebsiteSnapshotRepository:
web_server=item.web_server,
content_type=item.content_type,
tech=item.tech if item.tech else [],
body_preview=item.body_preview,
vhost=item.vhost
response_body=item.response_body,
vhost=item.vhost,
response_headers=item.response_headers if item.response_headers else ''
))
# Bulk create (ignore conflicts; deduplicate via the unique constraint)
@@ -100,7 +101,7 @@ class DjangoWebsiteSnapshotRepository:
.values(
'url', 'host', 'location', 'title', 'status',
'content_length', 'content_type', 'web_server', 'tech',
'body_preview', 'vhost', 'created_at'
'response_body', 'response_headers', 'vhost', 'created_at'
)
.order_by('url')
)
@@ -117,7 +118,8 @@ class DjangoWebsiteSnapshotRepository:
'content_type': row['content_type'],
'webserver': row['web_server'],
'tech': row['tech'],
'body_preview': row['body_preview'],
'response_body': row['response_body'],
'response_headers': row['response_headers'],
'vhost': row['vhost'],
'created_at': row['created_at'],
}


@@ -67,9 +67,10 @@ class SubdomainListSerializer(serializers.ModelSerializer):
class WebSiteSerializer(serializers.ModelSerializer):
"""Website serializer"""
"""Website serializer (target detail page)"""
subdomain = serializers.CharField(source='subdomain.name', allow_blank=True, default='')
responseHeaders = serializers.CharField(source='response_headers', read_only=True) # raw HTTP response headers
class Meta:
model = WebSite
@@ -83,9 +84,10 @@ class WebSiteSerializer(serializers.ModelSerializer):
'content_type',
'status_code',
'content_length',
'body_preview',
'response_body',
'tech',
'vhost',
'responseHeaders', # HTTP response headers
'subdomain',
'created_at',
]
@@ -140,6 +142,7 @@ class EndpointListSerializer(serializers.ModelSerializer):
source='matched_gf_patterns',
read_only=True,
)
responseHeaders = serializers.CharField(source='response_headers', read_only=True) # raw HTTP response headers
class Meta:
model = Endpoint
@@ -152,9 +155,10 @@ class EndpointListSerializer(serializers.ModelSerializer):
'content_length',
'content_type',
'webserver',
'body_preview',
'response_body',
'tech',
'vhost',
'responseHeaders', # HTTP response headers
'gfPatterns',
'created_at',
]
@@ -215,6 +219,7 @@ class WebsiteSnapshotSerializer(serializers.ModelSerializer):
subdomain_name = serializers.CharField(source='subdomain.name', read_only=True)
webserver = serializers.CharField(source='web_server', read_only=True) # mapped field name
status_code = serializers.IntegerField(source='status', read_only=True) # mapped field name
responseHeaders = serializers.CharField(source='response_headers', read_only=True) # raw HTTP response headers
class Meta:
model = WebsiteSnapshot
@@ -227,9 +232,10 @@ class WebsiteSnapshotSerializer(serializers.ModelSerializer):
'content_type',
'status_code', # uses the mapped field name
'content_length',
'body_preview',
'response_body',
'tech',
'vhost',
'responseHeaders', # HTTP response headers
'subdomain_name',
'created_at',
]
@@ -264,6 +270,7 @@ class EndpointSnapshotSerializer(serializers.ModelSerializer):
source='matched_gf_patterns',
read_only=True,
)
responseHeaders = serializers.CharField(source='response_headers', read_only=True) # raw HTTP response headers
class Meta:
model = EndpointSnapshot
@@ -277,9 +284,10 @@ class EndpointSnapshotSerializer(serializers.ModelSerializer):
'content_type',
'status_code',
'content_length',
'body_preview',
'response_body',
'tech',
'vhost',
'responseHeaders', # HTTP response headers
'gfPatterns',
'created_at',
]


@@ -28,6 +28,7 @@ class EndpointService:
'host': 'host',
'title': 'title',
'status': 'status_code',
'tech': 'tech',
}
def __init__(self):
@@ -115,7 +116,7 @@ class EndpointService:
"""Get all endpoints under the target"""
queryset = self.repo.get_by_target(target_id)
if filter_query:
queryset = apply_filters(queryset, filter_query, self.FILTER_FIELD_MAPPING)
queryset = apply_filters(queryset, filter_query, self.FILTER_FIELD_MAPPING, json_array_fields=['tech'])
return queryset
def count_endpoints_by_target(self, target_id: int) -> int:
@@ -134,7 +135,7 @@ class EndpointService:
"""Get all endpoints (global query)"""
queryset = self.repo.get_all()
if filter_query:
queryset = apply_filters(queryset, filter_query, self.FILTER_FIELD_MAPPING)
queryset = apply_filters(queryset, filter_query, self.FILTER_FIELD_MAPPING, json_array_fields=['tech'])
return queryset
def iter_endpoint_urls_by_target(self, target_id: int, chunk_size: int = 1000) -> Iterator[str]:


@@ -20,6 +20,7 @@ class WebSiteService:
'host': 'host',
'title': 'title',
'status': 'status_code',
'tech': 'tech',
}
def __init__(self, repository=None):
@@ -107,14 +108,14 @@ class WebSiteService:
"""Get all websites under the target"""
queryset = self.repo.get_by_target(target_id)
if filter_query:
queryset = apply_filters(queryset, filter_query, self.FILTER_FIELD_MAPPING)
queryset = apply_filters(queryset, filter_query, self.FILTER_FIELD_MAPPING, json_array_fields=['tech'])
return queryset
def get_all(self, filter_query: Optional[str] = None):
"""Get all websites"""
queryset = self.repo.get_all()
if filter_query:
queryset = apply_filters(queryset, filter_query, self.FILTER_FIELD_MAPPING)
queryset = apply_filters(queryset, filter_query, self.FILTER_FIELD_MAPPING, json_array_fields=['tech'])
return queryset
def get_by_url(self, url: str, target_id: int) -> int:


@@ -2,6 +2,8 @@ import logging
from rest_framework import viewsets, status, filters
from rest_framework.decorators import action
from rest_framework.response import Response
from apps.common.response_helpers import success_response, error_response
from apps.common.error_codes import ErrorCodes
from rest_framework.request import Request
from rest_framework.exceptions import NotFound, ValidationError as DRFValidationError
from django.core.exceptions import ValidationError, ObjectDoesNotExist
@@ -57,7 +59,7 @@ class AssetStatisticsViewSet(viewsets.ViewSet):
"""
try:
stats = self.service.get_statistics()
return Response({
return success_response(data={
'totalTargets': stats['total_targets'],
'totalSubdomains': stats['total_subdomains'],
'totalIps': stats['total_ips'],
@@ -80,9 +82,10 @@ class AssetStatisticsViewSet(viewsets.ViewSet):
})
except (DatabaseError, OperationalError) as e:
logger.exception("获取资产统计数据失败")
return Response(
{'error': '获取统计数据失败'},
status=status.HTTP_500_INTERNAL_SERVER_ERROR
return error_response(
code=ErrorCodes.SERVER_ERROR,
message='Failed to get statistics',
status_code=status.HTTP_500_INTERNAL_SERVER_ERROR
)
@action(detail=False, methods=['get'], url_path='history')
@@ -107,12 +110,13 @@ class AssetStatisticsViewSet(viewsets.ViewSet):
days = min(max(days, 1), 90) # clamp to the range 1-90 days
history = self.service.get_statistics_history(days=days)
return Response(history)
return success_response(data=history)
except (DatabaseError, OperationalError) as e:
logger.exception("获取统计历史数据失败")
return Response(
{'error': '获取历史数据失败'},
status=status.HTTP_500_INTERNAL_SERVER_ERROR
return error_response(
code=ErrorCodes.SERVER_ERROR,
message='Failed to get history data',
status_code=status.HTTP_500_INTERNAL_SERVER_ERROR
)
@@ -164,45 +168,50 @@ class SubdomainViewSet(viewsets.ModelViewSet):
Response:
{
"message": "批量创建完成",
"createdCount": 10,
"skippedCount": 2,
"invalidCount": 1,
"mismatchedCount": 1,
"totalReceived": 14
"data": {
"createdCount": 10,
"skippedCount": 2,
"invalidCount": 1,
"mismatchedCount": 1,
"totalReceived": 14
}
}
"""
from apps.targets.models import Target
target_pk = self.kwargs.get('target_pk')
if not target_pk:
return Response(
{'error': '必须在目标下批量创建子域名'},
status=status.HTTP_400_BAD_REQUEST
return error_response(
code=ErrorCodes.VALIDATION_ERROR,
message='Must create subdomains under a target',
status_code=status.HTTP_400_BAD_REQUEST
)
# Fetch the target
try:
target = Target.objects.get(pk=target_pk)
except Target.DoesNotExist:
return Response(
{'error': '目标不存在'},
status=status.HTTP_404_NOT_FOUND
return error_response(
code=ErrorCodes.NOT_FOUND,
message='Target not found',
status_code=status.HTTP_404_NOT_FOUND
)
# The target type must be a domain
if target.type != Target.TargetType.DOMAIN:
return Response(
{'error': '只有域名类型的目标支持导入子域名'},
status=status.HTTP_400_BAD_REQUEST
return error_response(
code=ErrorCodes.VALIDATION_ERROR,
message='Only domain type targets support subdomain import',
status_code=status.HTTP_400_BAD_REQUEST
)
# Read the subdomain list from the request body
subdomains = request.data.get('subdomains', [])
if not subdomains or not isinstance(subdomains, list):
return Response(
{'error': '请求体不能为空或格式错误'},
status=status.HTTP_400_BAD_REQUEST
return error_response(
code=ErrorCodes.VALIDATION_ERROR,
message='Request body cannot be empty or invalid format',
status_code=status.HTTP_400_BAD_REQUEST
)
# Delegate to the service layer
@@ -214,19 +223,19 @@ class SubdomainViewSet(viewsets.ModelViewSet):
)
except Exception as e:
logger.exception("批量创建子域名失败")
return Response(
{'error': '服务器内部错误'},
status=status.HTTP_500_INTERNAL_SERVER_ERROR
return error_response(
code=ErrorCodes.SERVER_ERROR,
message='Server internal error',
status_code=status.HTTP_500_INTERNAL_SERVER_ERROR
)
return Response({
'message': '批量创建完成',
return success_response(data={
'createdCount': result.created_count,
'skippedCount': result.skipped_count,
'invalidCount': result.invalid_count,
'mismatchedCount': result.mismatched_count,
'totalReceived': result.total_received,
}, status=status.HTTP_200_OK)
})
@action(detail=False, methods=['get'], url_path='export')
def export(self, request, **kwargs):
@@ -265,6 +274,7 @@ class WebSiteViewSet(viewsets.ModelViewSet):
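- host="example" fuzzy match on hostname
- title="login" fuzzy match on title
- status="200,301" multi-value match on status code
- tech="nginx" match on tech stack (array field)
- multiple conditions separated by spaces are combined with AND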
"""
@@ -299,35 +309,38 @@ class WebSiteViewSet(viewsets.ModelViewSet):
Response:
{
"message": "批量创建完成",
"createdCount": 10,
"mismatchedCount": 2
"data": {
"createdCount": 10
}
}
"""
from apps.targets.models import Target
target_pk = self.kwargs.get('target_pk')
if not target_pk:
return Response(
{'error': '必须在目标下批量创建网站'},
status=status.HTTP_400_BAD_REQUEST
return error_response(
code=ErrorCodes.VALIDATION_ERROR,
message='Must create websites under a target',
status_code=status.HTTP_400_BAD_REQUEST
)
# Fetch the target
try:
target = Target.objects.get(pk=target_pk)
except Target.DoesNotExist:
return Response(
{'error': '目标不存在'},
status=status.HTTP_404_NOT_FOUND
return error_response(
code=ErrorCodes.NOT_FOUND,
message='Target not found',
status_code=status.HTTP_404_NOT_FOUND
)
# Read the URL list from the request body
urls = request.data.get('urls', [])
if not urls or not isinstance(urls, list):
return Response(
{'error': '请求体不能为空或格式错误'},
status=status.HTTP_400_BAD_REQUEST
return error_response(
code=ErrorCodes.VALIDATION_ERROR,
message='Request body cannot be empty or invalid format',
status_code=status.HTTP_400_BAD_REQUEST
)
# Delegate to the service layer
@@ -340,21 +353,21 @@ class WebSiteViewSet(viewsets.ModelViewSet):
)
except Exception as e:
logger.exception("批量创建网站失败")
return Response(
{'error': '服务器内部错误'},
status=status.HTTP_500_INTERNAL_SERVER_ERROR
return error_response(
code=ErrorCodes.SERVER_ERROR,
message='Server internal error',
status_code=status.HTTP_500_INTERNAL_SERVER_ERROR
)
return Response({
'message': '批量创建完成',
return success_response(data={
'createdCount': created_count,
}, status=status.HTTP_200_OK)
})
@action(detail=False, methods=['get'], url_path='export')
def export(self, request, **kwargs):
"""Export websites as CSV
CSV columns: url, host, location, title, status_code, content_length, content_type, webserver, tech, body_preview, vhost, created_at
CSV columns: url, host, location, title, status_code, content_length, content_type, webserver, tech, response_body, response_headers, vhost, created_at
"""
from apps.common.utils import generate_csv_rows, format_datetime, format_list_field
@@ -367,7 +380,7 @@ class WebSiteViewSet(viewsets.ModelViewSet):
headers = [
'url', 'host', 'location', 'title', 'status_code',
'content_length', 'content_type', 'webserver', 'tech',
'body_preview', 'vhost', 'created_at'
'response_body', 'response_headers', 'vhost', 'created_at'
]
formatters = {
'created_at': format_datetime,
@@ -426,35 +439,38 @@ class DirectoryViewSet(viewsets.ModelViewSet):
Response:
{
"message": "批量创建完成",
"createdCount": 10,
"mismatchedCount": 2
"data": {
"createdCount": 10
}
}
"""
from apps.targets.models import Target
target_pk = self.kwargs.get('target_pk')
if not target_pk:
return Response(
{'error': '必须在目标下批量创建目录'},
status=status.HTTP_400_BAD_REQUEST
return error_response(
code=ErrorCodes.VALIDATION_ERROR,
message='Must create directories under a target',
status_code=status.HTTP_400_BAD_REQUEST
)
# Fetch the target
try:
target = Target.objects.get(pk=target_pk)
except Target.DoesNotExist:
return Response(
{'error': '目标不存在'},
status=status.HTTP_404_NOT_FOUND
return error_response(
code=ErrorCodes.NOT_FOUND,
message='Target not found',
status_code=status.HTTP_404_NOT_FOUND
)
# Read the URL list from the request body
urls = request.data.get('urls', [])
if not urls or not isinstance(urls, list):
return Response(
{'error': '请求体不能为空或格式错误'},
status=status.HTTP_400_BAD_REQUEST
return error_response(
code=ErrorCodes.VALIDATION_ERROR,
message='Request body cannot be empty or invalid format',
status_code=status.HTTP_400_BAD_REQUEST
)
# Delegate to the service layer
@@ -467,15 +483,15 @@ class DirectoryViewSet(viewsets.ModelViewSet):
)
except Exception as e:
logger.exception("批量创建目录失败")
return Response(
{'error': '服务器内部错误'},
status=status.HTTP_500_INTERNAL_SERVER_ERROR
return error_response(
code=ErrorCodes.SERVER_ERROR,
message='Server internal error',
status_code=status.HTTP_500_INTERNAL_SERVER_ERROR
)
return Response({
'message': '批量创建完成',
return success_response(data={
'createdCount': created_count,
}, status=status.HTTP_200_OK)
})
@action(detail=False, methods=['get'], url_path='export')
def export(self, request, **kwargs):
@@ -519,6 +535,7 @@ class EndpointViewSet(viewsets.ModelViewSet):
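- host="example" fuzzy match on hostname
- title="login" fuzzy match on title
- status="200,301" multi-value match on status code
- tech="nginx" match on tech stack (array field)
- multiple conditions separated by spaces are combined with AND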
"""
@@ -553,35 +570,38 @@ class EndpointViewSet(viewsets.ModelViewSet):
Response:
{
"message": "批量创建完成",
"createdCount": 10,
"mismatchedCount": 2
"data": {
"createdCount": 10
}
}
"""
from apps.targets.models import Target
target_pk = self.kwargs.get('target_pk')
if not target_pk:
return Response(
{'error': '必须在目标下批量创建端点'},
status=status.HTTP_400_BAD_REQUEST
return error_response(
code=ErrorCodes.VALIDATION_ERROR,
message='Must create endpoints under a target',
status_code=status.HTTP_400_BAD_REQUEST
)
# Fetch the target
try:
target = Target.objects.get(pk=target_pk)
except Target.DoesNotExist:
return Response(
{'error': '目标不存在'},
status=status.HTTP_404_NOT_FOUND
return error_response(
code=ErrorCodes.NOT_FOUND,
message='Target not found',
status_code=status.HTTP_404_NOT_FOUND
)
# Read the URL list from the request body
urls = request.data.get('urls', [])
if not urls or not isinstance(urls, list):
return Response(
{'error': '请求体不能为空或格式错误'},
status=status.HTTP_400_BAD_REQUEST
return error_response(
code=ErrorCodes.VALIDATION_ERROR,
message='Request body cannot be empty or invalid format',
status_code=status.HTTP_400_BAD_REQUEST
)
# Delegate to the service layer
@@ -594,21 +614,21 @@ class EndpointViewSet(viewsets.ModelViewSet):
)
except Exception as e:
logger.exception("批量创建端点失败")
return Response(
{'error': '服务器内部错误'},
status=status.HTTP_500_INTERNAL_SERVER_ERROR
return error_response(
code=ErrorCodes.SERVER_ERROR,
message='Server internal error',
status_code=status.HTTP_500_INTERNAL_SERVER_ERROR
)
return Response({
'message': '批量创建完成',
return success_response(data={
'createdCount': created_count,
}, status=status.HTTP_200_OK)
})
@action(detail=False, methods=['get'], url_path='export')
def export(self, request, **kwargs):
"""Export endpoints as CSV
CSV columns: url, host, location, title, status_code, content_length, content_type, webserver, tech, body_preview, vhost, matched_gf_patterns, created_at
CSV columns: url, host, location, title, status_code, content_length, content_type, webserver, tech, response_body, response_headers, vhost, matched_gf_patterns, created_at
"""
from apps.common.utils import generate_csv_rows, format_datetime, format_list_field
@@ -621,7 +641,7 @@ class EndpointViewSet(viewsets.ModelViewSet):
headers = [
'url', 'host', 'location', 'title', 'status_code',
'content_length', 'content_type', 'webserver', 'tech',
'body_preview', 'vhost', 'matched_gf_patterns', 'created_at'
'response_body', 'response_headers', 'vhost', 'matched_gf_patterns', 'created_at'
]
formatters = {
'created_at': format_datetime,
@@ -833,7 +853,7 @@ class WebsiteSnapshotViewSet(viewsets.ModelViewSet):
def export(self, request, **kwargs):
"""Export website snapshots as CSV
CSV columns: url, host, location, title, status_code, content_length, content_type, webserver, tech, body_preview, vhost, created_at
CSV columns: url, host, location, title, status_code, content_length, content_type, webserver, tech, response_body, response_headers, vhost, created_at
"""
from apps.common.utils import generate_csv_rows, format_datetime, format_list_field
@@ -846,7 +866,7 @@ class WebsiteSnapshotViewSet(viewsets.ModelViewSet):
headers = [
'url', 'host', 'location', 'title', 'status_code',
'content_length', 'content_type', 'webserver', 'tech',
'body_preview', 'vhost', 'created_at'
'response_body', 'response_headers', 'vhost', 'created_at'
]
formatters = {
'created_at': format_datetime,
@@ -950,7 +970,7 @@ class EndpointSnapshotViewSet(viewsets.ModelViewSet):
def export(self, request, **kwargs):
"""Export endpoint snapshots as CSV
CSV columns: url, host, location, title, status_code, content_length, content_type, webserver, tech, body_preview, vhost, matched_gf_patterns, created_at
CSV columns: url, host, location, title, status_code, content_length, content_type, webserver, tech, response_body, response_headers, vhost, matched_gf_patterns, created_at
"""
from apps.common.utils import generate_csv_rows, format_datetime, format_list_field
@@ -963,7 +983,7 @@ class EndpointSnapshotViewSet(viewsets.ModelViewSet):
headers = [
'url', 'host', 'location', 'title', 'status_code',
'content_length', 'content_type', 'webserver', 'tech',
'body_preview', 'vhost', 'matched_gf_patterns', 'created_at'
'response_body', 'response_headers', 'vhost', 'matched_gf_patterns', 'created_at'
]
formatters = {
'created_at': format_datetime,


@@ -40,8 +40,14 @@ def fetch_config_and_setup_django():
print(f"[CONFIG] 正在从配置中心获取配置: {config_url}")
print(f"[CONFIG] IS_LOCAL={is_local}")
try:
# Build the request headers (including the Worker API key)
headers = {}
worker_api_key = os.environ.get("WORKER_API_KEY", "")
if worker_api_key:
headers["X-Worker-API-Key"] = worker_api_key
# verify=False: a remote worker connecting over HTTPS may be served a self-signed certificate
resp = requests.get(config_url, timeout=10, verify=False)
resp = requests.get(config_url, headers=headers, timeout=10, verify=False)
resp.raise_for_status()
config = resp.json()
@@ -57,28 +63,17 @@ def fetch_config_and_setup_django():
os.environ.setdefault("DB_USER", db_user)
os.environ.setdefault("DB_PASSWORD", config['db']['password'])
# Redis configuration
os.environ.setdefault("REDIS_URL", config['redisUrl'])
# Logging configuration
os.environ.setdefault("LOG_DIR", config['paths']['logs'])
os.environ.setdefault("LOG_LEVEL", config['logging']['level'])
os.environ.setdefault("ENABLE_COMMAND_LOGGING", str(config['logging']['enableCommandLogging']).lower())
os.environ.setdefault("DEBUG", str(config['debug']))
# Git mirror configuration (used to speed up git clone)
git_mirror = config.get('gitMirror', '')
if git_mirror:
os.environ.setdefault("GIT_MIRROR", git_mirror)
print(f"[CONFIG] ✓ 配置获取成功")
print(f"[CONFIG] DB_HOST: {db_host}")
print(f"[CONFIG] DB_PORT: {db_port}")
print(f"[CONFIG] DB_NAME: {db_name}")
print(f"[CONFIG] DB_USER: {db_user}")
print(f"[CONFIG] REDIS_URL: {config['redisUrl']}")
if git_mirror:
print(f"[CONFIG] GIT_MIRROR: {git_mirror}")
except Exception as e:
print(f"[ERROR] 获取配置失败: {config_url} - {e}", file=sys.stderr)


@@ -0,0 +1,31 @@
"""
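Standardized error code definitions
Uses a simplified scheme (modeled on the practice of large vendors such as Stripe and GitHub):
- Define only 5-10 generic error codes
- Unknown errors use a generic error code
- Error code format: uppercase letters and underscores
"""
class ErrorCodes:
"""Standardized error codes
Only generic error codes are defined; other errors use generic messages.
This is the standard practice at large vendors such as Stripe and GitHub.
Error code format:
- Use uppercase letters and underscores
- Concise and easy to understand
- The frontend maps error codes to i18n keys
"""
# Generic error codes (8)
VALIDATION_ERROR = 'VALIDATION_ERROR' # input validation failed
NOT_FOUND = 'NOT_FOUND' # resource not found
PERMISSION_DENIED = 'PERMISSION_DENIED' # insufficient permissions
SERVER_ERROR = 'SERVER_ERROR' # internal server error
BAD_REQUEST = 'BAD_REQUEST' # malformed request
CONFLICT = 'CONFLICT' # resource conflict (e.g. duplicate creation)
UNAUTHORIZED = 'UNAUTHORIZED' # not authenticated
RATE_LIMITED = 'RATE_LIMITED' # too many requests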


@@ -0,0 +1,49 @@
"""
Custom exception handler
Handles DRF exceptions uniformly so that error responses share one format
"""
from rest_framework.views import exception_handler
from rest_framework import status
from rest_framework.exceptions import AuthenticationFailed, NotAuthenticated
from apps.common.response_helpers import error_response
from apps.common.error_codes import ErrorCodes
def custom_exception_handler(exc, context):
"""
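Custom exception handler
Handles authentication-related exceptions and returns errors in the unified format
"""
# Call DRF's default exception handler first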
response = exception_handler(exc, context)
if response is not None:
# Handle 401 unauthenticated errors
if response.status_code == status.HTTP_401_UNAUTHORIZED:
return error_response(
code=ErrorCodes.UNAUTHORIZED,
message='Authentication required',
status_code=status.HTTP_401_UNAUTHORIZED
)
# Handle 403 permission-denied errors
if response.status_code == status.HTTP_403_FORBIDDEN:
return error_response(
code=ErrorCodes.PERMISSION_DENIED,
message='Permission denied',
status_code=status.HTTP_403_FORBIDDEN
)
# Handle NotAuthenticated and AuthenticationFailed exceptions
if isinstance(exc, (NotAuthenticated, AuthenticationFailed)):
return error_response(
code=ErrorCodes.UNAUTHORIZED,
message='Authentication required',
status_code=status.HTTP_401_UNAUTHORIZED
)
return response


@@ -0,0 +1,80 @@
"""
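Centralized permission management
Implements the authentication logic for three classes of endpoints:
1. Public endpoints (no authentication): login, logout, current-user status
2. Worker endpoints (API key authentication): registration, config, heartbeat, callbacks, resource sync
3. Business endpoints (session authentication): all other APIs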
"""
import re
import logging
from django.conf import settings
from rest_framework.permissions import BasePermission
logger = logging.getLogger(__name__)
# Public endpoint whitelist (no authentication required)
PUBLIC_ENDPOINTS = [
r'^/api/auth/login/$',
r'^/api/auth/logout/$',
r'^/api/auth/me/$',
]
# Worker API endpoints (API key authentication required)
# Includes: registration, config, heartbeat, callbacks, resource sync (wordlist download)
WORKER_ENDPOINTS = [
r'^/api/workers/register/$',
r'^/api/workers/config/$',
r'^/api/workers/\d+/heartbeat/$',
r'^/api/callbacks/',
# Resource sync endpoints (workers need to download wordlist files)
r'^/api/wordlists/download/$',
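# Note: the fingerprint export API uses session authentication (it serves frontend user exports)
# Workers read fingerprint data directly from the database and do not need an HTTP API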
]
class IsAuthenticatedOrPublic(BasePermission):
"""
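Custom permission class:
- Whitelisted endpoints are publicly accessible
- Worker endpoints require API key authentication
- All other endpoints require session authentication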
"""
def has_permission(self, request, view):
path = request.path
# Check whether the path is in the public whitelist
for pattern in PUBLIC_ENDPOINTS:
if re.match(pattern, path):
return True
# Check whether this is a worker endpoint
for pattern in WORKER_ENDPOINTS:
if re.match(pattern, path):
return self._check_worker_api_key(request)
# All other paths require session authentication
return request.user and request.user.is_authenticated
def _check_worker_api_key(self, request):
"""Validate the Worker API key"""
api_key = request.headers.get('X-Worker-API-Key')
expected_key = getattr(settings, 'WORKER_API_KEY', None)
if not expected_key:
# When no API key is configured, reject all worker requests
logger.warning("WORKER_API_KEY 未配置,拒绝 Worker 请求")
return False
if not api_key:
logger.warning(f"Worker 请求缺少 X-Worker-API-Key Header: {request.path}")
return False
if api_key != expected_key:
logger.warning(f"Worker API Key 无效: {request.path}")
return False
return True
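
The three-way routing in `IsAuthenticatedOrPublic` can be sketched outside Django with the standard library alone. This is a minimal sketch under stated assumptions, not the code from this commit: `classify` and `key_matches` are hypothetical helpers, the pattern lists are shortened copies of the ones above, and the key check swaps the diff's `!=` comparison for `hmac.compare_digest`, a constant-time comparison that is sometimes preferred for secrets.

```python
import hmac
import re
from typing import Optional

# Shortened copies of the pattern lists in the diff (assumption: illustration only).
PUBLIC_ENDPOINTS = [r'^/api/auth/login/$', r'^/api/auth/me/$']
WORKER_ENDPOINTS = [r'^/api/workers/\d+/heartbeat/$', r'^/api/callbacks/']


def classify(path: str) -> str:
    """Return which authentication scheme applies to a request path."""
    if any(re.match(p, path) for p in PUBLIC_ENDPOINTS):
        return 'public'
    if any(re.match(p, path) for p in WORKER_ENDPOINTS):
        return 'worker'
    return 'session'


def key_matches(provided: Optional[str], expected: Optional[str]) -> bool:
    """Compare API keys in constant time; a missing key on either side fails."""
    if not provided or not expected:
        return False
    return hmac.compare_digest(provided.encode(), expected.encode())


print(classify('/api/workers/12/heartbeat/'))
print(classify('/api/targets/'))
print(key_matches('secret', 'secret'))
```

Because `r'^/api/callbacks/'` has no `$` anchor, every path under that prefix is treated as a worker endpoint in this sketch, which mirrors how `re.match` treats the unanchored pattern in the diff.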


@@ -0,0 +1,88 @@
"""
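Standardized API response helpers
Follows the industry standard (RFC 9457 Problem Details) and big-vendor practice (Google, Stripe, GitHub):
- Success responses contain only data, with no message field
- Error responses use machine-readable error codes that the frontend maps to i18n messages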
"""
from typing import Any, Dict, List, Optional, Union
from rest_framework import status
from rest_framework.response import Response
def success_response(
data: Optional[Union[Dict[str, Any], List[Any]]] = None,
status_code: int = status.HTTP_200_OK
) -> Response:
"""
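Standardized success response
Returns the data directly without a wrapper, in line with the conventions of large vendors such as Stripe/GitHub.
Args:
data: response data (dict or list)
status_code: HTTP status code, defaults to 200
Returns:
Response: DRF Response object
Examples:
# Single resource
>>> success_response(data={'id': 1, 'name': 'Test'})
{'id': 1, 'name': 'Test'}
# Operation result
>>> success_response(data={'count': 3, 'scans': [...]})
{'count': 3, 'scans': [...]}
# Resource creation
>>> success_response(data={'id': 1}, status_code=201)
"""
# Note: do not use `data or {}`, because an empty list [] would be converted to {}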
if data is None:
data = {}
return Response(data, status=status_code)
def error_response(
code: str,
message: Optional[str] = None,
details: Optional[List[Dict[str, Any]]] = None,
status_code: int = status.HTTP_400_BAD_REQUEST
) -> Response:
"""
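Standardized error response
Args:
code: error code (e.g. 'VALIDATION_ERROR', 'NOT_FOUND')
Format: uppercase letters and underscores
message: debugging information for developers (not shown to users)
details: detailed error information (e.g. field-level validation errors)
status_code: HTTP status code, defaults to 400
Returns:
Response: DRF Response object
Examples:
# Simple error
>>> error_response(code='NOT_FOUND', status_code=404)
{'error': {'code': 'NOT_FOUND'}}
# With debugging information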
>>> error_response(
... code='VALIDATION_ERROR',
... message='Invalid input data',
... details=[{'field': 'name', 'message': 'Required'}]
... )
{'error': {'code': 'VALIDATION_ERROR', 'message': '...', 'details': [...]}}
"""
error_body: Dict[str, Any] = {'code': code}
if message:
error_body['message'] = message
if details:
error_body['details'] = details
return Response({'error': error_body}, status=status_code)
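
To make the error envelope concrete without a DRF installation, the sketch below rebuilds only the body that `error_response` passes to `Response`. The helper name `build_error_body` is hypothetical; the field names `code`, `message`, and `details` follow the function above.

```python
from typing import Any, Dict, List, Optional


def build_error_body(
    code: str,
    message: Optional[str] = None,
    details: Optional[List[Dict[str, Any]]] = None,
) -> Dict[str, Any]:
    """Build the JSON body of an error response.

    Optional fields are omitted entirely when empty, so a bare error
    carries nothing but its machine-readable code.
    """
    error: Dict[str, Any] = {'code': code}
    if message:
        error['message'] = message
    if details:
        error['details'] = details
    return {'error': error}


print(build_error_body('NOT_FOUND'))
print(build_error_body(
    'VALIDATION_ERROR',
    message='Invalid input data',
    details=[{'field': 'name', 'message': 'Required'}],
))
```

A client can branch on `error.code` alone and treat `message` as debugging text, which matches the docstring's intent that the frontend maps codes to i18n messages.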


@@ -4,15 +4,28 @@
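Provides read access to system logs. Supports:
- Reading log files from the log directory
- Limiting the number of returned lines to prevent memory exhaustion
- Listing the available log files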
"""
import fnmatch
import logging
import os
import subprocess
from datetime import datetime, timezone
from typing import TypedDict
logger = logging.getLogger(__name__)
class LogFileInfo(TypedDict):
"""Log file information"""
filename: str
category: str # 'system' | 'error' | 'performance' | 'container'
size: int
modifiedAt: str # ISO 8601 format
class SystemLogService:
"""
System log service
@@ -20,23 +33,131 @@ class SystemLogService:
Reads system log files, either from the in-container path or from a host-mounted path.
"""
# Log file categorization rules
CATEGORY_RULES = [
('xingrin.log', 'system'),
('xingrin_error.log', 'error'),
('performance.log', 'performance'),
('container_*.log', 'container'),
]
def __init__(self):
# Log file path (always under /opt/xingrin/logs)
self.log_file = "/opt/xingrin/logs/xingrin.log"
self.default_lines = 200 # default number of lines returned
self.max_lines = 10000 # maximum number of lines returned
self.timeout_seconds = 3 # timeout for the tail command
# Log directory path
self.log_dir = "/opt/xingrin/logs"
self.default_file = "xingrin.log" # default log file
self.default_lines = 200 # default number of lines returned
self.max_lines = 10000 # maximum number of lines returned
self.timeout_seconds = 3 # timeout for the tail command
def get_logs_content(self, lines: int | None = None) -> str:
def _categorize_file(self, filename: str) -> str | None:
"""
Determine the log category from the file name
Returns:
The category name, or None if the file is not a log file
"""
for pattern, category in self.CATEGORY_RULES:
if fnmatch.fnmatch(filename, pattern):
return category
return None
def _validate_filename(self, filename: str) -> bool:
"""
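Check that the file name is valid (guards against path traversal)
Args:
filename: the file name to validate
Returns:
bool: whether the file name is valid
"""
# Path separators are not allowed
if '/' in filename or '\\' in filename:
return False
# ".." path traversal is not allowed
if '..' in filename:
return False
# Must be a known log file type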
return self._categorize_file(filename) is not None
def get_log_files(self) -> list[LogFileInfo]:
"""
List all available log files
Returns:
A list of log file info entries, sorted by category and then by file name
"""
files: list[LogFileInfo] = []
if not os.path.isdir(self.log_dir):
logger.warning("日志目录不存在: %s", self.log_dir)
return files
for filename in os.listdir(self.log_dir):
filepath = os.path.join(self.log_dir, filename)
# Only process files; skip directories
if not os.path.isfile(filepath):
continue
# Determine the category
category = self._categorize_file(filename)
if category is None:
continue
# Read the file metadata
try:
stat = os.stat(filepath)
modified_at = datetime.fromtimestamp(
stat.st_mtime, tz=timezone.utc
).isoformat()
files.append({
'filename': filename,
'category': category,
'size': stat.st_size,
'modifiedAt': modified_at,
})
except OSError as e:
logger.warning("获取文件信息失败 %s: %s", filepath, e)
continue
# Sort by category priority (system > error > performance > container), then by file name
category_order = {'system': 0, 'error': 1, 'performance': 2, 'container': 3}
files.sort(key=lambda f: (category_order.get(f['category'], 99), f['filename']))
return files
def get_logs_content(self, filename: str | None = None, lines: int | None = None) -> str:
"""
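Return the contents of a system log.
Args:
filename: log file name, defaults to xingrin.log
lines: number of log lines to return, default 200, maximum 10000
Returns:
str: log content, newline-separated, in the original order
Raises:
ValueError: the file name is invalid
FileNotFoundError: the log file does not exist
"""
# Fall back to the default file
if filename is None:
filename = self.default_file
# Validate the file name
if not self._validate_filename(filename):
raise ValueError(f"Invalid file name: {filename}")
# Build the full path
log_file = os.path.join(self.log_dir, filename)
# Make sure the file exists
if not os.path.isfile(log_file):
raise FileNotFoundError(f"Log file not found: {filename}")
# Validate arguments and apply defaults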
if lines is None:
lines = self.default_lines
@@ -48,7 +169,7 @@ class SystemLogService:
lines = self.max_lines
# Use the tail command to read the end of the log file
cmd = ["tail", "-n", str(lines), self.log_file]
cmd = ["tail", "-n", str(lines), log_file]
result = subprocess.run(
cmd,
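
The file-name checks and the category-priority ordering in SystemLogService can be tried in isolation. What follows is a minimal sketch under stated assumptions, not the project's code: only the `container_*.log` rule is visible in this diff, so the `xingrin.log*` pattern is assumed, and `is_valid_log_name` and `sort_log_names` are hypothetical helper names.

```python
import fnmatch

# Assumed subset of CATEGORY_RULES; only the container rule appears in the diff.
CATEGORY_RULES = [
    ("xingrin.log*", "system"),        # assumed pattern
    ("container_*.log", "container"),  # taken from the diff
]
CATEGORY_ORDER = {"system": 0, "error": 1, "performance": 2, "container": 3}


def categorize(filename):
    """Return the category of a log file name, or None if it matches no rule."""
    for pattern, category in CATEGORY_RULES:
        if fnmatch.fnmatch(filename, pattern):
            return category
    return None


def is_valid_log_name(filename):
    """Reject path separators and '..', then require a known log pattern."""
    if "/" in filename or "\\" in filename:
        return False
    if ".." in filename:
        return False
    return categorize(filename) is not None


def sort_log_names(names):
    """Drop unknown files, then sort by category priority and file name."""
    known = [n for n in names if categorize(n) is not None]
    return sorted(known, key=lambda n: (CATEGORY_ORDER.get(categorize(n), 99), n))
```

Because the last check is an allowlist, a name that passes the separator and `..` tests is still rejected unless it matches a known pattern.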


@@ -2,14 +2,18 @@
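Common module URL configuration
Routes:
- /api/health/ health check endpoint (no authentication required)
- /api/auth/* authentication endpoints (login, logout, user info)
- /api/system/* system management endpoints (log viewing, etc.)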
"""
from django.urls import path
from .views import LoginView, LogoutView, MeView, ChangePasswordView, SystemLogsView
from .views import LoginView, LogoutView, MeView, ChangePasswordView, SystemLogsView, SystemLogFilesView, HealthCheckView
urlpatterns = [
# Health check (no authentication required)
path('health/', HealthCheckView.as_view(), name='health-check'),
# Authentication
path('auth/login/', LoginView.as_view(), name='auth-login'),
path('auth/logout/', LogoutView.as_view(), name='auth-logout'),
@@ -18,4 +22,5 @@ urlpatterns = [
# System management
path('system/logs/', SystemLogsView.as_view(), name='system-logs'),
path('system/logs/files/', SystemLogFilesView.as_view(), name='system-log-files'),
]


@@ -13,7 +13,6 @@ from .csv_utils import (
format_datetime,
UTF8_BOM,
)
from .git_proxy import get_git_proxy_url
__all__ = [
'deduplicate_for_bulk',
@@ -26,5 +25,4 @@ __all__ = [
'format_list_field',
'format_datetime',
'UTF8_BOM',
'get_git_proxy_url',
]


@@ -29,11 +29,19 @@ from dataclasses import dataclass
from typing import List, Dict, Optional, Union
from enum import Enum
from django.db.models import QuerySet, Q
from django.db.models import QuerySet, Q, F, Func, CharField
from django.db.models.functions import Cast
logger = logging.getLogger(__name__)
class ArrayToString(Func):
"""PostgreSQL array_to_string function"""
function = 'array_to_string'
template = "%(function)s(%(expressions)s, ',')"
output_field = CharField()
class LogicalOp(Enum):
"""Logical operators"""
AND = 'AND'
@@ -86,9 +94,21 @@ class QueryParser:
if not query_string or not query_string.strip():
return []
# Step 1: extract every filter condition and replace it with a placeholder, protecting spaces inside quotes
filters_found = []
placeholder_pattern = '__FILTER_{}__'
def replace_filter(match):
idx = len(filters_found)
filters_found.append(match.group(0))
return placeholder_pattern.format(idx)
# First use the regex to extract every condition of the form field="value"
protected = cls.FILTER_PATTERN.sub(replace_filter, query_string)
# Normalize the logical operators
# Handle || and or first -> __OR__
normalized = cls.OR_PATTERN.sub(' __OR__ ', query_string)
normalized = cls.OR_PATTERN.sub(' __OR__ ', protected)
# Then handle && and and -> __AND__
normalized = cls.AND_PATTERN.sub(' __AND__ ', normalized)
@@ -103,20 +123,26 @@ class QueryParser:
pending_op = LogicalOp.OR
elif token == '__AND__':
pending_op = LogicalOp.AND
else:
# Try to parse the token as a filter condition
match = cls.FILTER_PATTERN.match(token)
if match:
field, operator, value = match.groups()
groups.append(FilterGroup(
filter=ParsedFilter(
field=field.lower(),
operator=operator,
value=value
),
logical_op=pending_op if groups else LogicalOp.AND # the first condition defaults to AND
))
pending_op = LogicalOp.AND # reset to the default AND
elif token.startswith('__FILTER_') and token.endswith('__'):
# Restore the placeholder to the original filter condition
try:
idx = int(token[9:-2]) # extract the index
original_filter = filters_found[idx]
match = cls.FILTER_PATTERN.match(original_filter)
if match:
field, operator, value = match.groups()
groups.append(FilterGroup(
filter=ParsedFilter(
field=field.lower(),
operator=operator,
value=value
),
logical_op=pending_op if groups else LogicalOp.AND
))
pending_op = LogicalOp.AND # reset to the default AND
except (ValueError, IndexError):
pass
# Any other token is ignored (invalid input)
return groups
@@ -151,6 +177,21 @@ class QueryBuilder:
json_array_fields = json_array_fields or []
# Collect the array fields that need annotate for fuzzy search
array_fuzzy_fields = set()
# First pass: check whether any filter is a fuzzy match on an array field
for group in filter_groups:
f = group.filter
db_field = field_mapping.get(f.field)
if db_field and db_field in json_array_fields and f.operator == '=':
array_fuzzy_fields.add(db_field)
# Annotate each array field used in a fuzzy search
for field in array_fuzzy_fields:
annotate_name = f'{field}_text'
queryset = queryset.annotate(**{annotate_name: ArrayToString(F(field))})
# Build the Q object
combined_q = None
@@ -187,8 +228,17 @@ class QueryBuilder:
def _build_single_q(cls, field: str, operator: str, value: str, is_json_array: bool = False) -> Optional[Q]:
"""Build the Q object for a single condition"""
if is_json_array:
# JSON array fields use a __contains lookup
return Q(**{f'{field}__contains': [value]})
if operator == '==':
# Exact match: the array contains an element exactly equal to value
return Q(**{f'{field}__contains': [value]})
elif operator == '!=':
# Exclusion: the array contains no element exactly equal to value
return ~Q(**{f'{field}__contains': [value]})
else: # '=' fuzzy match
# Fuzzy search against the annotated field
# build_query has already converted the field to text via ArrayToString
annotate_name = f'{field}_text'
return Q(**{f'{annotate_name}__icontains': value})
if operator == '!=':
return cls._build_not_equal_q(field, value)
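
The placeholder step added to QueryParser keeps the words `or` and `and` inside a quoted value from being read as logical operators. The sketch below reproduces that idea in a standalone form. It is an illustration only: the three regular expressions are assumed stand-ins, since the real `FILTER_PATTERN`, `OR_PATTERN`, and `AND_PATTERN` are not shown in this diff, and `parse` is a hypothetical function that returns plain tuples instead of `FilterGroup` objects.

```python
import re

# Assumed stand-ins for the class-level patterns, which the diff does not show.
FILTER_PATTERN = re.compile(r'(\w+)(==|!=|=)"([^"]*)"')
OR_PATTERN = re.compile(r'\s*\|\|\s*|\s+or\s+', re.IGNORECASE)
AND_PATTERN = re.compile(r'\s*&&\s*|\s+and\s+', re.IGNORECASE)


def parse(query):
    """Return a list of (logical_op, field, operator, value) tuples."""
    found = []

    def stash(match):
        # Swap each field="value" condition for a whitespace-free placeholder.
        found.append(match.group(0))
        return f"__FILTER_{len(found) - 1}__"

    protected = FILTER_PATTERN.sub(stash, query)
    normalized = OR_PATTERN.sub(" __OR__ ", protected)
    normalized = AND_PATTERN.sub(" __AND__ ", normalized)

    result = []
    pending = "AND"
    for token in normalized.split():
        if token == "__OR__":
            pending = "OR"
        elif token == "__AND__":
            pending = "AND"
        elif token.startswith("__FILTER_") and token.endswith("__"):
            original = found[int(token[9:-2])]
            field, op, value = FILTER_PATTERN.match(original).groups()
            # The first condition always gets AND, as in the diff.
            result.append((pending if result else "AND", field.lower(), op, value))
            pending = "AND"
    return result
```

With these assumed patterns, a query such as `title="hello world or more" || status=="200"` yields two conditions joined by OR, and the quoted value keeps its spaces and its inner `or`.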


@@ -1,39 +0,0 @@
"""Git proxy utilities for URL acceleration."""
import os
from urllib.parse import urlparse
def get_git_proxy_url(original_url: str) -> str:
"""
Convert Git repository URL to proxy format for acceleration.
Supports multiple mirror services (standard format):
- gh-proxy.org: https://gh-proxy.org/https://github.com/user/repo.git
- ghproxy.com: https://ghproxy.com/https://github.com/user/repo.git
- mirror.ghproxy.com: https://mirror.ghproxy.com/https://github.com/user/repo.git
- ghps.cc: https://ghps.cc/https://github.com/user/repo.git
Args:
original_url: Original repository URL, e.g., https://github.com/user/repo.git
Returns:
Converted URL based on GIT_MIRROR setting.
If GIT_MIRROR is not set, returns the original URL unchanged.
"""
git_mirror = os.getenv("GIT_MIRROR", "").strip()
if not git_mirror:
return original_url
# Remove trailing slash from mirror URL if present
git_mirror = git_mirror.rstrip("/")
parsed = urlparse(original_url)
host = parsed.netloc.lower()
# Only support GitHub for now
if "github.com" not in host:
return original_url
# Standard format: https://mirror.example.com/https://github.com/user/repo.git
return f"{git_mirror}/{original_url}"


@@ -2,11 +2,17 @@
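Common module view exports
Includes:
- Health check view: Docker health check
- Authentication views: login, logout, user info, change password
- System log views: live log viewing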
"""
from .health_views import HealthCheckView
from .auth_views import LoginView, LogoutView, MeView, ChangePasswordView
from .system_log_views import SystemLogsView
from .system_log_views import SystemLogsView, SystemLogFilesView
__all__ = ['LoginView', 'LogoutView', 'MeView', 'ChangePasswordView', 'SystemLogsView']
__all__ = [
'HealthCheckView',
'LoginView', 'LogoutView', 'MeView', 'ChangePasswordView',
'SystemLogsView', 'SystemLogFilesView',
]


@@ -9,7 +9,10 @@ from django.utils.decorators import method_decorator
from rest_framework import status
from rest_framework.views import APIView
from rest_framework.response import Response
from rest_framework.permissions import AllowAny, IsAuthenticated
from rest_framework.permissions import AllowAny
from apps.common.response_helpers import success_response, error_response
from apps.common.error_codes import ErrorCodes
logger = logging.getLogger(__name__)
@@ -28,9 +31,10 @@ class LoginView(APIView):
password = request.data.get('password')
if not username or not password:
return Response(
{'error': '请提供用户名和密码'},
status=status.HTTP_400_BAD_REQUEST
return error_response(
code=ErrorCodes.VALIDATION_ERROR,
message='Username and password are required',
status_code=status.HTTP_400_BAD_REQUEST
)
user = authenticate(request, username=username, password=password)
@@ -38,20 +42,22 @@ class LoginView(APIView):
if user is not None:
login(request, user)
logger.info(f"User {username} logged in")
return Response({
'message': '登录成功',
'user': {
'id': user.id,
'username': user.username,
'isStaff': user.is_staff,
'isSuperuser': user.is_superuser,
return success_response(
data={
'user': {
'id': user.id,
'username': user.username,
'isStaff': user.is_staff,
'isSuperuser': user.is_superuser,
}
}
})
)
else:
logger.warning(f"User {username} failed to log in: invalid username or password")
return Response(
{'error': '用户名或密码错误'},
status=status.HTTP_401_UNAUTHORIZED
return error_response(
code=ErrorCodes.UNAUTHORIZED,
message='Invalid username or password',
status_code=status.HTTP_401_UNAUTHORIZED
)
@@ -79,7 +85,7 @@ class LogoutView(APIView):
logout(request)
else:
logout(request)
return Response({'message': '已登出'})
return success_response()
@method_decorator(csrf_exempt, name='dispatch')
@@ -100,22 +106,26 @@ class MeView(APIView):
if user_id:
try:
user = User.objects.get(pk=user_id)
return Response({
'authenticated': True,
'user': {
'id': user.id,
'username': user.username,
'isStaff': user.is_staff,
'isSuperuser': user.is_superuser,
return success_response(
data={
'authenticated': True,
'user': {
'id': user.id,
'username': user.username,
'isStaff': user.is_staff,
'isSuperuser': user.is_superuser,
}
}
})
)
except User.DoesNotExist:
pass
return Response({
'authenticated': False,
'user': None
})
return success_response(
data={
'authenticated': False,
'user': None
}
)
@method_decorator(csrf_exempt, name='dispatch')
@@ -124,43 +134,27 @@ class ChangePasswordView(APIView):
Change password
POST /api/auth/change-password/
"""
authentication_classes = [] # disable authentication (bypasses CSRF)
permission_classes = [AllowAny] # login state is checked manually
def post(self, request):
# Manually check login state (read the user from the session)
from django.contrib.auth import get_user_model
User = get_user_model()
user_id = request.session.get('_auth_user_id')
if not user_id:
return Response(
{'error': '请先登录'},
status=status.HTTP_401_UNAUTHORIZED
)
try:
user = User.objects.get(pk=user_id)
except User.DoesNotExist:
return Response(
{'error': '用户不存在'},
status=status.HTTP_401_UNAUTHORIZED
)
# The global permission class has already authenticated the request, so request.user is the logged-in user
user = request.user
# CamelCaseParser converts oldPassword -> old_password
old_password = request.data.get('old_password')
new_password = request.data.get('new_password')
if not old_password or not new_password:
return Response(
{'error': '请提供旧密码和新密码'},
status=status.HTTP_400_BAD_REQUEST
return error_response(
code=ErrorCodes.VALIDATION_ERROR,
message='Old password and new password are required',
status_code=status.HTTP_400_BAD_REQUEST
)
if not user.check_password(old_password):
return Response(
{'error': '旧密码错误'},
status=status.HTTP_400_BAD_REQUEST
return error_response(
code=ErrorCodes.VALIDATION_ERROR,
message='Old password is incorrect',
status_code=status.HTTP_400_BAD_REQUEST
)
user.set_password(new_password)
@@ -170,4 +164,4 @@ class ChangePasswordView(APIView):
update_session_auth_hash(request, user)
logger.info(f"User {user.username} changed their password")
return Response({'message': '密码修改成功'})
return success_response()


@@ -0,0 +1,24 @@
"""
Health check view
Provides the Docker health check endpoint; no authentication required.
"""
from rest_framework.views import APIView
from rest_framework.response import Response
from rest_framework.permissions import AllowAny
class HealthCheckView(APIView):
"""
Health check endpoint
GET /api/health/
Returns the service status, used by the Docker health check.
This endpoint requires no authentication.
"""
permission_classes = [AllowAny]
def get(self, request):
return Response({'status': 'ok'})


@@ -9,16 +9,57 @@ import logging
from django.utils.decorators import method_decorator
from django.views.decorators.csrf import csrf_exempt
from rest_framework import status
from rest_framework.permissions import AllowAny
from rest_framework.response import Response
from rest_framework.views import APIView
from apps.common.response_helpers import success_response, error_response
from apps.common.error_codes import ErrorCodes
from apps.common.services.system_log_service import SystemLogService
logger = logging.getLogger(__name__)
@method_decorator(csrf_exempt, name="dispatch")
class SystemLogFilesView(APIView):
"""
Log file list API view
GET /api/system/logs/files/
Returns the list of all available log files
Response:
{
"files": [
{
"filename": "xingrin.log",
"category": "system",
"size": 1048576,
"modifiedAt": "2025-01-15T10:30:00+00:00"
},
...
]
}
"""
def __init__(self, **kwargs):
super().__init__(**kwargs)
self.service = SystemLogService()
def get(self, request):
"""Return the list of log files"""
try:
files = self.service.get_log_files()
return success_response(data={"files": files})
except Exception:
logger.exception("Failed to get log file list")
return error_response(
code=ErrorCodes.SERVER_ERROR,
message='Failed to get log files',
status_code=status.HTTP_500_INTERNAL_SERVER_ERROR
)
@method_decorator(csrf_exempt, name="dispatch")
class SystemLogsView(APIView):
"""
@@ -28,21 +69,14 @@ class SystemLogsView(APIView):
Return system log content
Query Parameters:
file (str, optional): log file name, defaults to xingrin.log
lines (int, optional): number of log lines to return, default 200, maximum 10000
Response:
{
"content": "log content string..."
}
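Note:
- Currently in development, so anonymous access is temporarily allowed
- Production should add an administrator permission check
"""
# TODO: switch to the IsAdminUser permission in production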
authentication_classes = []
permission_classes = [AllowAny]
def __init__(self, **kwargs):
super().__init__(**kwargs)
@@ -52,18 +86,33 @@ class SystemLogsView(APIView):
"""
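Return the system log
The lines parameter controls how many lines are returned, for front-end pagination or live refresh
The file and lines parameters control what is returned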
"""
try:
# Parse the lines parameter
# Parse parameters
filename = request.query_params.get("file")
lines_raw = request.query_params.get("lines")
lines = int(lines_raw) if lines_raw is not None else None
# Call the service to fetch the log content
content = self.service.get_logs_content(lines=lines)
return Response({"content": content})
except ValueError:
return Response({"error": "lines 参数必须是整数"}, status=status.HTTP_400_BAD_REQUEST)
content = self.service.get_logs_content(filename=filename, lines=lines)
return success_response(data={"content": content})
except ValueError as e:
return error_response(
code=ErrorCodes.VALIDATION_ERROR,
message='lines must be an integer' if str(e).startswith('invalid literal') else str(e),
status_code=status.HTTP_400_BAD_REQUEST
)
except FileNotFoundError as e:
return error_response(
code=ErrorCodes.NOT_FOUND,
message=str(e),
status_code=status.HTTP_404_NOT_FOUND
)
except Exception:
logger.exception("Failed to get system logs")
return Response({"error": "获取系统日志失败"}, status=status.HTTP_500_INTERNAL_SERVER_ERROR)
return error_response(
code=ErrorCodes.SERVER_ERROR,
message='Failed to get system logs',
status_code=status.HTTP_500_INTERNAL_SERVER_ERROR
)


@@ -0,0 +1,44 @@
"""
WebSocket authentication base class
Provides a base class for WebSocket consumers that require authentication
"""
import logging
from channels.generic.websocket import AsyncWebsocketConsumer
logger = logging.getLogger(__name__)
class AuthenticatedWebsocketConsumer(AsyncWebsocketConsumer):
"""
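Base class for WebSocket consumers that require authentication
Subclasses should override on_connect() to implement their own connection logic
"""
async def connect(self):
"""
Verify the user's authentication state on connect
Unauthenticated connections are rejected with close(code=4001)
"""
user = self.scope.get('user')
if not user or not user.is_authenticated:
logger.warning(
f"WebSocket connection rejected: user not authenticated - Path: {self.scope.get('path')}"
)
await self.close(code=4001)
return
# Run the subclass's connection logic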
await self.on_connect()
async def on_connect(self):
"""
Subclasses implement their own connection logic here
Default implementation: accept the connection
"""
await self.accept()


@@ -6,17 +6,17 @@ import json
import logging
import asyncio
import os
from channels.generic.websocket import AsyncWebsocketConsumer
from asgiref.sync import sync_to_async
from django.conf import settings
from apps.common.websocket_auth import AuthenticatedWebsocketConsumer
from apps.engine.services import WorkerService
logger = logging.getLogger(__name__)
class WorkerDeployConsumer(AsyncWebsocketConsumer):
class WorkerDeployConsumer(AuthenticatedWebsocketConsumer):
"""
WebSocket consumer for the interactive Worker terminal
@@ -31,8 +31,8 @@ class WorkerDeployConsumer(AsyncWebsocketConsumer):
self.read_task = None
self.worker_service = WorkerService()
async def connect(self):
"""On connect, join the matching Worker's group and open the SSH connection automatically"""
async def on_connect(self):
"""On connect, join the matching Worker's group and open the SSH connection automatically (already authenticated)"""
self.worker_id = self.scope['url_route']['kwargs']['worker_id']
self.group_name = f'worker_deploy_{self.worker_id}'


@@ -15,9 +15,10 @@
"""
from django.core.management.base import BaseCommand
from io import StringIO
from pathlib import Path
import yaml
from ruamel.yaml import YAML
from apps.engine.models import ScanEngine
@@ -44,10 +45,12 @@ class Command(BaseCommand):
with open(config_path, 'r', encoding='utf-8') as f:
default_config = f.read()
# Parse the YAML into a dict, used later to generate the sub-engine configs
# Parse with ruamel.yaml to preserve comments
yaml_parser = YAML()
yaml_parser.preserve_quotes = True
try:
config_dict = yaml.safe_load(default_config) or {}
except yaml.YAMLError as e:
config_dict = yaml_parser.load(default_config) or {}
except Exception as e:
self.stdout.write(self.style.ERROR(f'Failed to parse engine config YAML: {e}'))
return
@@ -83,15 +86,13 @@ class Command(BaseCommand):
if scan_type != 'subdomain_discovery' and 'tools' not in scan_cfg:
continue
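# Build a YAML document containing only the current scan type's config
# Build a YAML document containing only the current scan type's config (comments preserved)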
single_config = {scan_type: scan_cfg}
try:
single_yaml = yaml.safe_dump(
single_config,
sort_keys=False,
allow_unicode=True,
)
except yaml.YAMLError as e:
stream = StringIO()
yaml_parser.dump(single_config, stream)
single_yaml = stream.getvalue()
except Exception as e:
self.stdout.write(self.style.ERROR(f'Failed to generate config for sub-engine {scan_type}: {e}'))
continue


@@ -3,6 +3,9 @@
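- EHole fingerprints: ehole.json -> imported into the database
- Goby fingerprints: goby.json -> imported into the database
- Wappalyzer fingerprints: wappalyzer.json -> imported into the database
- Fingers fingerprints: fingers_http.json -> imported into the database
- FingerPrintHub fingerprints: fingerprinthub_web.json -> imported into the database
- ARL fingerprints: ARL.yaml -> imported into the database
Safe to run repeatedly: if the database already contains data the import is skipped; data is imported only into an empty database.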
"""
@@ -11,14 +14,25 @@ import json
import logging
from pathlib import Path
import yaml
from django.conf import settings
from django.core.management.base import BaseCommand
from apps.engine.models import EholeFingerprint, GobyFingerprint, WappalyzerFingerprint
from apps.engine.models import (
EholeFingerprint,
GobyFingerprint,
WappalyzerFingerprint,
FingersFingerprint,
FingerPrintHubFingerprint,
ARLFingerprint,
)
from apps.engine.services.fingerprints import (
EholeFingerprintService,
GobyFingerprintService,
WappalyzerFingerprintService,
FingersFingerprintService,
FingerPrintHubFingerprintService,
ARLFingerprintService,
)
@@ -33,6 +47,7 @@ DEFAULT_FINGERPRINTS = [
"model": EholeFingerprint,
"service": EholeFingerprintService,
"data_key": "fingerprint", # key of the fingerprint array in the JSON
"file_format": "json",
},
{
"type": "goby",
@@ -40,6 +55,7 @@ DEFAULT_FINGERPRINTS = [
"model": GobyFingerprint,
"service": GobyFingerprintService,
"data_key": None, # Goby is a plain array, so the whole JSON is used
"file_format": "json",
},
{
"type": "wappalyzer",
@@ -47,6 +63,31 @@ DEFAULT_FINGERPRINTS = [
"model": WappalyzerFingerprint,
"service": WappalyzerFingerprintService,
"data_key": "apps", # Wappalyzer uses the apps object
"file_format": "json",
},
{
"type": "fingers",
"filename": "fingers_http.json",
"model": FingersFingerprint,
"service": FingersFingerprintService,
"data_key": None, # Fingers is a plain array
"file_format": "json",
},
{
"type": "fingerprinthub",
"filename": "fingerprinthub_web.json",
"model": FingerPrintHubFingerprint,
"service": FingerPrintHubFingerprintService,
"data_key": None, # FingerPrintHub is a plain array
"file_format": "json",
},
{
"type": "arl",
"filename": "ARL.yaml",
"model": ARLFingerprint,
"service": ARLFingerprintService,
"data_key": None, # ARL is a YAML array
"file_format": "yaml",
},
]
@@ -68,6 +109,7 @@ class Command(BaseCommand):
model = item["model"]
service_class = item["service"]
data_key = item["data_key"]
file_format = item.get("file_format", "json")
# Check whether the database already has data
existing_count = model.objects.count()
@@ -87,11 +129,14 @@ class Command(BaseCommand):
failed += 1
continue
# Read and parse the JSON
# Read and parse the file (JSON and YAML are supported)
try:
with open(src_path, "r", encoding="utf-8") as f:
json_data = json.load(f)
except (json.JSONDecodeError, OSError) as exc:
if file_format == "yaml":
file_data = yaml.safe_load(f)
else:
file_data = json.load(f)
except (json.JSONDecodeError, yaml.YAMLError, OSError) as exc:
self.stdout.write(self.style.ERROR(
f"[{fp_type}] Failed to read fingerprint file: {exc}"
))
@@ -99,7 +144,7 @@ class Command(BaseCommand):
continue
# Extract the fingerprint data (handled per format)
fingerprints = self._extract_fingerprints(json_data, data_key, fp_type)
fingerprints = self._extract_fingerprints(file_data, data_key, fp_type)
if not fingerprints:
self.stdout.write(self.style.WARNING(
f"[{fp_type}] No valid data in fingerprint file, skipping"


@@ -4,7 +4,14 @@
"""
from .engine import WorkerNode, ScanEngine, Wordlist, NucleiTemplateRepo
from .fingerprints import EholeFingerprint, GobyFingerprint, WappalyzerFingerprint
from .fingerprints import (
EholeFingerprint,
GobyFingerprint,
WappalyzerFingerprint,
FingersFingerprint,
FingerPrintHubFingerprint,
ARLFingerprint,
)
__all__ = [
# Core models
@@ -16,4 +23,7 @@ __all__ = [
"EholeFingerprint",
"GobyFingerprint",
"WappalyzerFingerprint",
"FingersFingerprint",
"FingerPrintHubFingerprint",
"ARLFingerprint",
]


@@ -106,3 +106,90 @@ class WappalyzerFingerprint(models.Model):
def __str__(self) -> str:
return f"{self.name}"
class FingersFingerprint(models.Model):
"""Fingerprint rules in Fingers format (fingers_http.json)
Matches with regular expressions and tags; supports favicon hash, header, body, and other detection methods
"""
name = models.CharField(max_length=300, unique=True, help_text='指纹名称')
link = models.URLField(max_length=500, blank=True, default='', help_text='相关链接')
rule = models.JSONField(default=list, help_text='匹配规则数组')
tag = models.JSONField(default=list, help_text='标签数组')
focus = models.BooleanField(default=False, help_text='是否重点关注')
default_port = models.JSONField(default=list, blank=True, help_text='默认端口数组')
created_at = models.DateTimeField(auto_now_add=True)
class Meta:
db_table = 'fingers_fingerprint'
verbose_name = 'Fingers 指纹'
verbose_name_plural = 'Fingers 指纹'
ordering = ['-created_at']
indexes = [
models.Index(fields=['name']),
models.Index(fields=['link']),
models.Index(fields=['focus']),
models.Index(fields=['-created_at']),
]
def __str__(self) -> str:
return f"{self.name}"
class FingerPrintHubFingerprint(models.Model):
"""Fingerprint rules in FingerPrintHub format (fingerprinthub_web.json)
Based on the nuclei template format; matches on HTTP request and response characteristics
"""
fp_id = models.CharField(max_length=200, unique=True, help_text='指纹ID')
name = models.CharField(max_length=300, help_text='指纹名称')
author = models.CharField(max_length=200, blank=True, default='', help_text='作者')
tags = models.CharField(max_length=500, blank=True, default='', help_text='标签')
severity = models.CharField(max_length=50, blank=True, default='info', help_text='严重程度')
metadata = models.JSONField(default=dict, blank=True, help_text='元数据')
http = models.JSONField(default=list, help_text='HTTP 匹配规则')
source_file = models.CharField(max_length=500, blank=True, default='', help_text='来源文件')
created_at = models.DateTimeField(auto_now_add=True)
class Meta:
db_table = 'fingerprinthub_fingerprint'
verbose_name = 'FingerPrintHub 指纹'
verbose_name_plural = 'FingerPrintHub 指纹'
ordering = ['-created_at']
indexes = [
models.Index(fields=['fp_id']),
models.Index(fields=['name']),
models.Index(fields=['author']),
models.Index(fields=['severity']),
models.Index(fields=['-created_at']),
]
def __str__(self) -> str:
return f"{self.name} ({self.fp_id})"
class ARLFingerprint(models.Model):
"""Fingerprint rules in ARL format (ARL.yaml)
Uses a simple name + rule expression format
"""
name = models.CharField(max_length=300, unique=True, help_text='指纹名称')
rule = models.TextField(help_text='匹配规则表达式')
created_at = models.DateTimeField(auto_now_add=True)
class Meta:
db_table = 'arl_fingerprint'
verbose_name = 'ARL 指纹'
verbose_name_plural = 'ARL 指纹'
ordering = ['-created_at']
indexes = [
models.Index(fields=['name']),
models.Index(fields=['-created_at']),
]
def __str__(self) -> str:
return f"{self.name}"


@@ -6,9 +6,15 @@
from .ehole import EholeFingerprintSerializer
from .goby import GobyFingerprintSerializer
from .wappalyzer import WappalyzerFingerprintSerializer
from .fingers import FingersFingerprintSerializer
from .fingerprinthub import FingerPrintHubFingerprintSerializer
from .arl import ARLFingerprintSerializer
__all__ = [
"EholeFingerprintSerializer",
"GobyFingerprintSerializer",
"WappalyzerFingerprintSerializer",
"FingersFingerprintSerializer",
"FingerPrintHubFingerprintSerializer",
"ARLFingerprintSerializer",
]


@@ -0,0 +1,31 @@
"""ARL fingerprint serializer"""
from rest_framework import serializers
from apps.engine.models import ARLFingerprint
class ARLFingerprintSerializer(serializers.ModelSerializer):
"""Serializer for ARL fingerprints
Field mapping:
- name: fingerprint name (required, unique)
- rule: match rule expression (required)
"""
class Meta:
model = ARLFingerprint
fields = ['id', 'name', 'rule', 'created_at']
read_only_fields = ['id', 'created_at']
def validate_name(self, value):
"""Validate the name field"""
if not value or not value.strip():
raise serializers.ValidationError("name must not be empty")
return value.strip()
def validate_rule(self, value):
"""Validate the rule field"""
if not value or not value.strip():
raise serializers.ValidationError("rule must not be empty")
return value.strip()


@@ -0,0 +1,50 @@
"""FingerPrintHub fingerprint serializer"""
from rest_framework import serializers
from apps.engine.models import FingerPrintHubFingerprint
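class FingerPrintHubFingerprintSerializer(serializers.ModelSerializer):
"""Serializer for FingerPrintHub fingerprints
Field mapping:
- fp_id: fingerprint ID (required, unique)
- name: fingerprint name (required)
- author: author (optional)
- tags: tag string (optional)
- severity: severity (optional, defaults to 'info')
- metadata: metadata JSON (optional)
- http: array of HTTP match rules (required)
- source_file: source file (optional)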
"""
class Meta:
model = FingerPrintHubFingerprint
fields = ['id', 'fp_id', 'name', 'author', 'tags', 'severity',
'metadata', 'http', 'source_file', 'created_at']
read_only_fields = ['id', 'created_at']
def validate_fp_id(self, value):
"""Validate the fp_id field"""
if not value or not value.strip():
raise serializers.ValidationError("fp_id must not be empty")
return value.strip()
def validate_name(self, value):
"""Validate the name field"""
if not value or not value.strip():
raise serializers.ValidationError("name must not be empty")
return value.strip()
def validate_http(self, value):
"""Validate the http field"""
if not isinstance(value, list):
raise serializers.ValidationError("http must be an array")
return value
def validate_metadata(self, value):
"""Validate the metadata field"""
if not isinstance(value, dict):
raise serializers.ValidationError("metadata must be an object")
return value


@@ -0,0 +1,48 @@
"""Fingers fingerprint serializer"""
from rest_framework import serializers
from apps.engine.models import FingersFingerprint
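class FingersFingerprintSerializer(serializers.ModelSerializer):
"""Serializer for Fingers fingerprints
Field mapping:
- name: fingerprint name (required, unique)
- link: related link (optional)
- rule: array of match rules (required)
- tag: array of tags (optional)
- focus: whether to flag as high-priority (optional, defaults to False)
- default_port: array of default ports (optional)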
"""
class Meta:
model = FingersFingerprint
fields = ['id', 'name', 'link', 'rule', 'tag', 'focus',
'default_port', 'created_at']
read_only_fields = ['id', 'created_at']
def validate_name(self, value):
"""Validate the name field"""
if not value or not value.strip():
raise serializers.ValidationError("name must not be empty")
return value.strip()
def validate_rule(self, value):
"""Validate the rule field"""
if not isinstance(value, list):
raise serializers.ValidationError("rule must be an array")
return value
def validate_tag(self, value):
"""Validate the tag field"""
if not isinstance(value, list):
raise serializers.ValidationError("tag must be an array")
return value
def validate_default_port(self, value):
"""Validate the default_port field"""
if not isinstance(value, list):
raise serializers.ValidationError("default_port must be an array")
return value


@@ -66,6 +66,7 @@ def get_start_agent_script(
# Substitute variables
script = script.replace("{{HEARTBEAT_API_URL}}", heartbeat_api_url or '')
script = script.replace("{{WORKER_ID}}", str(worker_id) if worker_id else '')
script = script.replace("{{WORKER_API_KEY}}", getattr(settings, 'WORKER_API_KEY', ''))
# Inject the image version config (ensures remote nodes use the same version)
docker_user = getattr(settings, 'DOCKER_USER', 'yyhuni')


@@ -7,10 +7,16 @@ from .base import BaseFingerprintService
from .ehole import EholeFingerprintService
from .goby import GobyFingerprintService
from .wappalyzer import WappalyzerFingerprintService
from .fingers_service import FingersFingerprintService
from .fingerprinthub_service import FingerPrintHubFingerprintService
from .arl_service import ARLFingerprintService
__all__ = [
"BaseFingerprintService",
"EholeFingerprintService",
"GobyFingerprintService",
"WappalyzerFingerprintService",
"FingersFingerprintService",
"FingerPrintHubFingerprintService",
"ARLFingerprintService",
]


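@@ -0,0 +1,110 @@
"""ARL fingerprint management service
Implements validation, conversion, and export logic for ARL-format fingerprints
Supports import and export in YAML format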
"""
import logging
import yaml
from apps.engine.models import ARLFingerprint
from .base import BaseFingerprintService
logger = logging.getLogger(__name__)
class ARLFingerprintService(BaseFingerprintService):
"""ARL fingerprint management service (extends the base class with ARL-specific logic)"""
model = ARLFingerprint
def validate_fingerprint(self, item: dict) -> bool:
"""
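Validate a single ARL fingerprint
Validation rules:
- name must be present and non-empty
- rule must be present and non-empty
Args:
item: a single fingerprint record
Returns:
bool: whether the record is valid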
"""
name = item.get('name', '')
rule = item.get('rule', '')
return bool(name and str(name).strip()) and bool(rule and str(rule).strip())
def to_model_data(self, item: dict) -> dict:
"""
Convert an ARL YAML record into Model fields
Args:
item: raw ARL YAML data
Returns:
dict: Model field data
"""
return {
'name': str(item.get('name', '')).strip(),
'rule': str(item.get('rule', '')).strip(),
}
def get_export_data(self) -> list:
"""
Return the export data (ARL format: an array, used for YAML export)
Returns:
list: data in ARL format (an array)
[
{"name": "...", "rule": "..."},
...
]
"""
fingerprints = self.model.objects.all()
return [
{
'name': fp.name,
'rule': fp.rule,
}
for fp in fingerprints
]
def export_to_yaml(self, output_path: str) -> int:
"""
Export all fingerprints to a YAML file
Args:
output_path: output file path
Returns:
int: number of fingerprints exported
"""
data = self.get_export_data()
with open(output_path, 'w', encoding='utf-8') as f:
yaml.dump(data, f, allow_unicode=True, default_flow_style=False, sort_keys=False)
count = len(data)
logger.info("Exported ARL fingerprint file: %s, count: %d", output_path, count)
return count
def parse_yaml_import(self, yaml_content: str) -> list:
"""
Parse YAML import content
Args:
yaml_content: YAML content as a string
Returns:
list: the parsed list of fingerprint records
Raises:
ValueError: if the YAML is invalid
"""
try:
data = yaml.safe_load(yaml_content)
if not isinstance(data, list):
raise ValueError("ARL YAML file must be an array")
return data
except yaml.YAMLError as e:
raise ValueError(f"Invalid YAML: {e}")


@@ -0,0 +1,110 @@
"""FingerPrintHub fingerprint management service
Implements validation, conversion, and export logic for FingerPrintHub-format fingerprints
"""
from apps.engine.models import FingerPrintHubFingerprint
from .base import BaseFingerprintService
class FingerPrintHubFingerprintService(BaseFingerprintService):
"""FingerPrintHub fingerprint management service (extends the base class with FingerPrintHub-specific logic)"""
model = FingerPrintHubFingerprint
def validate_fingerprint(self, item: dict) -> bool:
"""
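Validate a single FingerPrintHub fingerprint
Validation rules:
- id must be present and non-empty
- info must be present and contain name
- http must be an array
Args:
item: a single fingerprint record
Returns:
bool: whether the record is valid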
"""
fp_id = item.get('id', '')
info = item.get('info', {})
http = item.get('http')
if not fp_id or not str(fp_id).strip():
return False
if not isinstance(info, dict) or not info.get('name'):
return False
if not isinstance(http, list):
return False
return True
def to_model_data(self, item: dict) -> dict:
"""
Convert a FingerPrintHub JSON record into Model fields
Field mapping (nested structure flattened):
- id (JSON) → fp_id (Model)
- info.name (JSON) → name (Model)
- info.author (JSON) → author (Model)
- info.tags (JSON) → tags (Model)
- info.severity (JSON) → severity (Model)
- info.metadata (JSON) → metadata (Model)
- http (JSON) → http (Model)
- _source_file (JSON) → source_file (Model)
Args:
item: raw FingerPrintHub JSON data
Returns:
dict: Model field data
"""
info = item.get('info', {})
return {
'fp_id': str(item.get('id', '')).strip(),
'name': str(info.get('name', '')).strip(),
'author': info.get('author', ''),
'tags': info.get('tags', ''),
'severity': info.get('severity', 'info'),
'metadata': info.get('metadata', {}),
'http': item.get('http', []),
'source_file': item.get('_source_file', ''),
}
def get_export_data(self) -> list:
"""
Return the export data (FingerPrintHub JSON format: an array)
Returns:
list: JSON data in FingerPrintHub format (an array)
[
{
"id": "...",
"info": {"name": "...", "author": "...", "tags": "...",
"severity": "...", "metadata": {...}},
"http": [...],
"_source_file": "..."
},
...
]
"""
fingerprints = self.model.objects.all()
data = []
for fp in fingerprints:
item = {
'id': fp.fp_id,
'info': {
'name': fp.name,
'author': fp.author,
'tags': fp.tags,
'severity': fp.severity,
'metadata': fp.metadata,
},
'http': fp.http,
}
# Add the field only when source_file is non-empty
if fp.source_file:
item['_source_file'] = fp.source_file
data.append(item)
return data
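
The validation and flattening rules above can be tried without Django. The following is a minimal sketch, assuming standalone functions that mirror the two service methods; the sample record is made up for illustration and is not taken from a real FingerPrintHub file.

```python
def validate_fingerprint(item: dict) -> bool:
    """Apply the three rules: non-empty id, info.name present, http is a list."""
    fp_id = item.get('id', '')
    info = item.get('info', {})
    if not fp_id or not str(fp_id).strip():
        return False
    if not isinstance(info, dict) or not info.get('name'):
        return False
    return isinstance(item.get('http'), list)


def to_model_data(item: dict) -> dict:
    """Flatten the nested record into model-style field names."""
    info = item.get('info', {})
    return {
        'fp_id': str(item.get('id', '')).strip(),
        'name': str(info.get('name', '')).strip(),
        'author': info.get('author', ''),
        'tags': info.get('tags', ''),
        'severity': info.get('severity', 'info'),
        'metadata': info.get('metadata', {}),
        'http': item.get('http', []),
        'source_file': item.get('_source_file', ''),
    }


# Hypothetical sample record, for illustration only.
sample = {
    'id': ' wordpress ',
    'info': {'name': 'WordPress', 'tags': 'cms'},
    'http': [{'method': 'GET', 'path': ['{{BaseURL}}/']}],
}
row = to_model_data(sample)
```

Here `row['fp_id']` comes out stripped of surrounding whitespace, and `row['severity']` falls back to `'info'` because the sample omits it.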


@@ -0,0 +1,83 @@
"""Fingers fingerprint management Service
Implements validation, conversion, and export logic for Fingers-format fingerprints
"""
from apps.engine.models import FingersFingerprint
from .base import BaseFingerprintService
class FingersFingerprintService(BaseFingerprintService):
"""Fingers fingerprint management service (extends the base class with Fingers-specific logic)"""
model = FingersFingerprint
def validate_fingerprint(self, item: dict) -> bool:
"""
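Validate a single Fingers fingerprint
Validation rules:
- the name field must be present and non-empty
- the rule field must be a list
Args:
item: a single fingerprint record
Returns:
bool: whether the record is valid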
"""
name = item.get('name', '')
rule = item.get('rule')
return bool(name and str(name).strip()) and isinstance(rule, list)
def to_model_data(self, item: dict) -> dict:
"""
Convert the Fingers JSON format into Model fields
Field mapping:
- default_port (JSON) → default_port (Model)
Args:
item: raw Fingers JSON record
Returns:
dict: Model field data
"""
return {
'name': str(item.get('name', '')).strip(),
'link': item.get('link', ''),
'rule': item.get('rule', []),
'tag': item.get('tag', []),
'focus': item.get('focus', False),
'default_port': item.get('default_port', []),
}
def get_export_data(self) -> list:
"""
Get the export data (Fingers JSON format, as an array)
Returns:
list: JSON data in Fingers format (array)
[
{"name": "...", "link": "...", "rule": [...], "tag": [...],
"focus": false, "default_port": [...]},
...
]
"""
fingerprints = self.model.objects.all()
data = []
for fp in fingerprints:
item = {
'name': fp.name,
'link': fp.link,
'rule': fp.rule,
'tag': fp.tag,
}
# Add focus only when it is True (keeps the output consistent with the original format)
if fp.focus:
item['focus'] = fp.focus
# Add default_port only when it is non-empty
if fp.default_port:
item['default_port'] = fp.default_port
data.append(item)
return data
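
Because the export drops `focus` when it is false and `default_port` when it is empty, a re-import has to rely on the defaults in `to_model_data` to restore them. The sketch below illustrates that round trip under stated assumptions: `export_item` is a hypothetical per-record helper extracted from the loop above, and `SimpleNamespace` stands in for a model instance.

```python
from types import SimpleNamespace


def export_item(fp) -> dict:
    """Build one exported record, omitting falsy optional fields."""
    item = {'name': fp.name, 'link': fp.link, 'rule': fp.rule, 'tag': fp.tag}
    if fp.focus:
        item['focus'] = fp.focus
    if fp.default_port:
        item['default_port'] = fp.default_port
    return item


def to_model_data(item: dict) -> dict:
    """Import side: missing optional keys fall back to their defaults."""
    return {
        'name': str(item.get('name', '')).strip(),
        'link': item.get('link', ''),
        'rule': item.get('rule', []),
        'tag': item.get('tag', []),
        'focus': item.get('focus', False),
        'default_port': item.get('default_port', []),
    }


# Stand-in for a stored model row (not a real Django object).
stored = SimpleNamespace(
    name='nginx', link='', rule=[], tag=['web'], focus=False, default_port=[]
)
exported = export_item(stored)
restored = to_model_data(exported)
```

The exported record carries only four keys, yet `restored` matches the stored row field for field, which is what keeps export followed by import lossless.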


@@ -186,7 +186,6 @@ class NucleiTemplateRepoService:
RuntimeError: the Git command failed
"""
import subprocess
from apps.common.utils.git_proxy import get_git_proxy_url
obj = self._get_repo_obj(repo_id)
@@ -197,14 +196,12 @@ class NucleiTemplateRepoService:
cmd: List[str]
action: str
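# Get the proxied URL (if Git acceleration is enabled)
proxied_url = get_git_proxy_url(obj.repo_url)
if proxied_url != obj.repo_url:
logger.info("使用 Git 加速: %s -> %s", obj.repo_url, proxied_url)
# Use the original URL directly (Git acceleration is no longer used)
repo_url = obj.repo_url
# Decide between clone and pull
if git_dir.is_dir():
# Check whether the remote URL changed (compare against the original URL, not the proxied one)
# Check whether the remote URL changed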
current_remote = subprocess.run(
["git", "-C", str(local_path), "remote", "get-url", "origin"],
check=False,
@@ -214,13 +211,13 @@ class NucleiTemplateRepoService:
)
current_url = current_remote.stdout.strip() if current_remote.returncode == 0 else ""
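# Check whether a re-clone is needed (a change in either the original or the proxied URL requires one)
if current_url not in [obj.repo_url, proxied_url]:
# Check whether a re-clone is needed
if current_url != repo_url:
# Remote URL changed: delete the old directory and clone again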
logger.info("nuclei 模板仓库 %s 远程地址变化,重新 clone: %s -> %s", obj.id, current_url, obj.repo_url)
logger.info("nuclei 模板仓库 %s 远程地址变化,重新 clone: %s -> %s", obj.id, current_url, repo_url)
shutil.rmtree(local_path)
local_path.mkdir(parents=True, exist_ok=True)
cmd = ["git", "clone", "--depth", "1", proxied_url, str(local_path)]
cmd = ["git", "clone", "--depth", "1", repo_url, str(local_path)]
action = "clone"
else:
# Repository exists and the URL is unchanged: run pull
@@ -231,7 +228,7 @@ class NucleiTemplateRepoService:
if local_path.exists() and not local_path.is_dir():
raise RuntimeError(f"本地路径已存在且不是目录: {local_path}")
# --depth 1 makes a shallow clone that fetches only the latest commit, saving space and time
cmd = ["git", "clone", "--depth", "1", proxied_url, str(local_path)]
cmd = ["git", "clone", "--depth", "1", repo_url, str(local_path)]
action = "clone"
# Run the Git command
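
After this change the clone-or-pull decision depends on a single URL comparison. A rough sketch of that decision as a pure function follows; `plan_git_command` is a hypothetical name, and the `pull --ff-only` command line is an assumption, since the hunk does not show the actual pull invocation.

```python
from typing import List, Tuple


def plan_git_command(
    repo_url: str, local_path: str, has_git_dir: bool, current_url: str
) -> Tuple[str, List[str], bool]:
    """Return (action, cmd, wipe_first) for a template repository refresh.

    wipe_first is True when an existing checkout points at a different
    remote and has to be removed before cloning again.
    """
    if has_git_dir and current_url == repo_url:
        # Assumed pull command; the real one is outside the shown hunk.
        return "pull", ["git", "-C", local_path, "pull", "--ff-only"], False
    clone = ["git", "clone", "--depth", "1", repo_url, local_path]
    return "clone", clone, has_git_dir


same = plan_git_command("https://example.com/t.git", "/tmp/t", True, "https://example.com/t.git")
moved = plan_git_command("https://example.com/new.git", "/tmp/t", True, "https://example.com/t.git")
fresh = plan_git_command("https://example.com/t.git", "/tmp/t", False, "")
```

Keeping the decision separate from `subprocess.run` makes the three cases (unchanged remote, changed remote, first clone) easy to check without touching the filesystem.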


@@ -274,7 +274,7 @@ class TaskDistributor:
network_arg = ""
server_url = f"https://{settings.PUBLIC_HOST}:{settings.PUBLIC_PORT}"
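# Mount path (everything is mounted under /opt/xingrin)
# Mount path (everything is mounted under /opt/xingrin; the scan tools live in /opt/xingrin-tools/bin and are unaffected)
host_xingrin_dir = "/opt/xingrin"
# Environment variables: SERVER_URL + IS_LOCAL; the container fetches the remaining settings from the config center at startup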
@@ -284,6 +284,7 @@ class TaskDistributor:
env_vars = [
f"-e SERVER_URL={shlex.quote(server_url)}",
f"-e IS_LOCAL={is_local_str}",
f"-e WORKER_API_KEY={shlex.quote(settings.WORKER_API_KEY)}", # Worker API authentication key
"-e PREFECT_HOME=/tmp/.prefect", # Point the Prefect data directory at a writable location
"-e PREFECT_SERVER_EPHEMERAL_ENABLED=true", # Enable the ephemeral server (a local temporary server)
"-e PREFECT_SERVER_EPHEMERAL_STARTUP_TIMEOUT_SECONDS=120", # Raise the startup timeout
@@ -311,11 +312,10 @@ class TaskDistributor:
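# - Local Worker: install.sh has already pulled the image, so the local copy is used directly
# - Remote Worker: the image is pulled at deploy time, so the local copy is used directly
# - Avoids checking Docker Hub on every task, improving performance and stability
# Wrap the sh -c command in double quotes so the single-quoted arguments produced by shlex.quote parse correctly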
cmd = f'''docker run --rm -d --pull=missing {network_arg} \
{' '.join(env_vars)} \
{' '.join(volumes)} \
{self.docker_image} \
cmd = f'''docker run --rm -d --pull=missing {network_arg} \\
{' '.join(env_vars)} \\
{' '.join(volumes)} \\
{self.docker_image} \\
sh -c "{inner_cmd}"'''
return cmd
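
The switch from `\` to `\\` matters because, in a non-raw Python string literal, a backslash directly before a newline is consumed by Python itself and the two lines are joined, while `\\` emits a literal backslash that the shell later reads as a line continuation. A small sketch, assuming a simplified hypothetical `build_docker_cmd` with made-up image and environment values:

```python
import shlex


def build_docker_cmd(image: str, env: dict, inner_cmd: str) -> str:
    """Assemble a multi-line docker run command with quoted env values."""
    env_part = ' '.join(
        f"-e {key}={shlex.quote(value)}" for key, value in env.items()
    )
    # Each "\\" below is one literal backslash in the resulting string, so
    # every line ends with a backslash plus newline (a shell continuation).
    return f'''docker run --rm -d --pull=missing \\
{env_part} \\
{image} \\
sh -c "{inner_cmd}"'''


cmd = build_docker_cmd(
    image='example/worker:latest',
    env={'SERVER_URL': 'https://example.com:8083', 'NOTE': 'two words'},
    inner_cmd='echo ready',
)
print(cmd)
```

`shlex.quote` leaves the URL untouched because it contains only shell-safe characters, and wraps `two words` in single quotes, which is why the outer `sh -c` argument uses double quotes.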


@@ -11,6 +11,9 @@ from .views.fingerprints import (
EholeFingerprintViewSet,
GobyFingerprintViewSet,
WappalyzerFingerprintViewSet,
FingersFingerprintViewSet,
FingerPrintHubFingerprintViewSet,
ARLFingerprintViewSet,
)
@@ -24,6 +27,9 @@ router.register(r"nuclei/repos", NucleiTemplateRepoViewSet, basename="nuclei-rep
router.register(r"fingerprints/ehole", EholeFingerprintViewSet, basename="ehole-fingerprint")
router.register(r"fingerprints/goby", GobyFingerprintViewSet, basename="goby-fingerprint")
router.register(r"fingerprints/wappalyzer", WappalyzerFingerprintViewSet, basename="wappalyzer-fingerprint")
router.register(r"fingerprints/fingers", FingersFingerprintViewSet, basename="fingers-fingerprint")
router.register(r"fingerprints/fingerprinthub", FingerPrintHubFingerprintViewSet, basename="fingerprinthub-fingerprint")
router.register(r"fingerprints/arl", ARLFingerprintViewSet, basename="arl-fingerprint")
urlpatterns = [
path("", include(router.urls)),


@@ -7,10 +7,16 @@ from .base import BaseFingerprintViewSet
from .ehole import EholeFingerprintViewSet
from .goby import GobyFingerprintViewSet
from .wappalyzer import WappalyzerFingerprintViewSet
from .fingers import FingersFingerprintViewSet
from .fingerprinthub import FingerPrintHubFingerprintViewSet
from .arl import ARLFingerprintViewSet
__all__ = [
"BaseFingerprintViewSet",
"EholeFingerprintViewSet",
"GobyFingerprintViewSet",
"WappalyzerFingerprintViewSet",
"FingersFingerprintViewSet",
"FingerPrintHubFingerprintViewSet",
"ARLFingerprintViewSet",
]


@@ -0,0 +1,122 @@
"""ARL fingerprint management ViewSet"""
import yaml
from django.http import HttpResponse
from rest_framework.decorators import action
from rest_framework.exceptions import ValidationError
from apps.common.pagination import BasePagination
from apps.common.response_helpers import success_response
from apps.engine.models import ARLFingerprint
from apps.engine.serializers.fingerprints import ARLFingerprintSerializer
from apps.engine.services.fingerprints import ARLFingerprintService
from .base import BaseFingerprintViewSet
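class ARLFingerprintViewSet(BaseFingerprintViewSet):
"""ARL fingerprint management ViewSet
Inherits from BaseFingerprintViewSet and provides the following APIs:
Standard CRUD (ModelViewSet):
- GET / list (paginated)
- POST / create one
- GET /{id}/ retrieve
- PUT /{id}/ update
- DELETE /{id}/ delete
Bulk operations (inherited from the base class):
- POST /batch_create/ bulk create (JSON body)
- POST /import_file/ file import (multipart/form-data, YAML supported)
- POST /bulk-delete/ bulk delete
- POST /delete-all/ delete all
- GET /export/ export download (YAML format)
Smart filter syntax (filter parameter):
- name="word" fuzzy match on the name field
- name=="WordPress" exact match
- rule="body=" filter by rule content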
"""
queryset = ARLFingerprint.objects.all()
serializer_class = ARLFingerprintSerializer
pagination_class = BasePagination
service_class = ARLFingerprintService
# Ordering configuration
ordering_fields = ['created_at', 'name']
ordering = ['-created_at']
# ARL filter field mapping
FILTER_FIELD_MAPPING = {
'name': 'name',
'rule': 'rule',
}
def parse_import_data(self, json_data) -> list:
"""
Parse import data in ARL format (JSON format)
Input format: [{...}, {...}] (an array)
Returns: list of fingerprints
"""
if isinstance(json_data, list):
return json_data
return []
def get_export_filename(self) -> str:
"""Export filename"""
return 'ARL.yaml'
@action(detail=False, methods=['post'])
def import_file(self, request):
"""
File import (YAML and JSON formats supported)
POST /api/engine/fingerprints/arl/import_file/
Request format: multipart/form-data
- file: a YAML or JSON file
Returns: same as batch_create
"""
file = request.FILES.get('file')
if not file:
raise ValidationError('缺少文件')
filename = file.name.lower()
content = file.read().decode('utf-8')
# Import before the try block so json.JSONDecodeError is bound even on the YAML path
import json
try:
if filename.endswith('.yaml') or filename.endswith('.yml'):
# YAML format
fingerprints = yaml.safe_load(content)
else:
# JSON format
fingerprints = json.loads(content)
except (yaml.YAMLError, json.JSONDecodeError) as e:
raise ValidationError(f'无效的文件格式: {e}')
if not isinstance(fingerprints, list):
raise ValidationError('文件内容必须是数组格式')
if not fingerprints:
raise ValidationError('文件中没有有效的指纹数据')
result = self.get_service().batch_create_fingerprints(fingerprints)
return success_response(data=result)
@action(detail=False, methods=['get'])
def export(self, request):
"""
Export fingerprints (YAML format)
GET /api/engine/fingerprints/arl/export/
Returns: a YAML file download
"""
data = self.get_service().get_export_data()
content = yaml.dump(data, allow_unicode=True, default_flow_style=False, sort_keys=False)
response = HttpResponse(content, content_type='application/x-yaml')
response['Content-Disposition'] = f'attachment; filename="{self.get_export_filename()}"'
return response


@@ -13,6 +13,7 @@ from rest_framework.response import Response
from rest_framework.exceptions import ValidationError
from apps.common.pagination import BasePagination
from apps.common.response_helpers import success_response
from apps.common.utils.filter_utils import apply_filters
logger = logging.getLogger(__name__)
@@ -129,7 +130,7 @@ class BaseFingerprintViewSet(viewsets.ModelViewSet):
raise ValidationError('fingerprints 必须是数组')
result = self.get_service().batch_create_fingerprints(fingerprints)
return Response(result, status=status.HTTP_201_CREATED)
return success_response(data=result, status_code=status.HTTP_201_CREATED)
@action(detail=False, methods=['post'])
def import_file(self, request):
@@ -156,7 +157,7 @@ class BaseFingerprintViewSet(viewsets.ModelViewSet):
raise ValidationError('文件中没有有效的指纹数据')
result = self.get_service().batch_create_fingerprints(fingerprints)
return Response(result, status=status.HTTP_201_CREATED)
return success_response(data=result, status_code=status.HTTP_201_CREATED)
@action(detail=False, methods=['post'], url_path='bulk-delete')
def bulk_delete(self, request):
@@ -174,7 +175,7 @@ class BaseFingerprintViewSet(viewsets.ModelViewSet):
raise ValidationError('ids 必须是数组')
deleted_count = self.queryset.model.objects.filter(id__in=ids).delete()[0]
return Response({'deleted': deleted_count})
return success_response(data={'deleted': deleted_count})
@action(detail=False, methods=['post'], url_path='delete-all')
def delete_all(self, request):
@@ -185,7 +186,7 @@ class BaseFingerprintViewSet(viewsets.ModelViewSet):
Returns: {"deleted": 1000}
"""
deleted_count = self.queryset.model.objects.all().delete()[0]
return Response({'deleted': deleted_count})
return success_response(data={'deleted': deleted_count})
@action(detail=False, methods=['get'])
def export(self, request):


@@ -0,0 +1,73 @@
"""FingerPrintHub fingerprint management ViewSet"""
from apps.common.pagination import BasePagination
from apps.engine.models import FingerPrintHubFingerprint
from apps.engine.serializers.fingerprints import FingerPrintHubFingerprintSerializer
from apps.engine.services.fingerprints import FingerPrintHubFingerprintService
from .base import BaseFingerprintViewSet
class FingerPrintHubFingerprintViewSet(BaseFingerprintViewSet):
"""FingerPrintHub fingerprint management ViewSet
Inherits from BaseFingerprintViewSet and provides the following APIs:
Standard CRUD (ModelViewSet):
- GET / list (paginated)
- POST / create one
- GET /{id}/ retrieve
- PUT /{id}/ update
- DELETE /{id}/ delete
Bulk operations (inherited from the base class):
- POST /batch_create/ bulk create (JSON body)
- POST /import_file/ file import (multipart/form-data)
- POST /bulk-delete/ bulk delete
- POST /delete-all/ delete all
- GET /export/ export download
Smart filter syntax (filter parameter):
- name="word" fuzzy match on the name field
- fp_id=="xxx" exact match on the fingerprint ID
- author="xxx" filter by author
- severity="info" filter by severity
- tags="cms" filter by tag
"""
queryset = FingerPrintHubFingerprint.objects.all()
serializer_class = FingerPrintHubFingerprintSerializer
pagination_class = BasePagination
service_class = FingerPrintHubFingerprintService
# Ordering configuration
ordering_fields = ['created_at', 'name', 'severity']
ordering = ['-created_at']
# FingerPrintHub filter field mapping
FILTER_FIELD_MAPPING = {
'fp_id': 'fp_id',
'name': 'name',
'author': 'author',
'tags': 'tags',
'severity': 'severity',
'source_file': 'source_file',
}
# JSON array fields (queried with __contains)
JSON_ARRAY_FIELDS = ['http']
def parse_import_data(self, json_data) -> list:
"""
Parse import data in FingerPrintHub JSON format
Input format: [{...}, {...}] (an array)
Returns: list of fingerprints
"""
if isinstance(json_data, list):
return json_data
return []
def get_export_filename(self) -> str:
"""Export filename"""
return 'fingerprinthub_web.json'


@@ -0,0 +1,69 @@
"""Fingers fingerprint management ViewSet"""
from apps.common.pagination import BasePagination
from apps.engine.models import FingersFingerprint
from apps.engine.serializers.fingerprints import FingersFingerprintSerializer
from apps.engine.services.fingerprints import FingersFingerprintService
from .base import BaseFingerprintViewSet
class FingersFingerprintViewSet(BaseFingerprintViewSet):
"""Fingers fingerprint management ViewSet
Inherits from BaseFingerprintViewSet and provides the following APIs:
Standard CRUD (ModelViewSet):
- GET / list (paginated)
- POST / create one
- GET /{id}/ retrieve
- PUT /{id}/ update
- DELETE /{id}/ delete
Bulk operations (inherited from the base class):
- POST /batch_create/ bulk create (JSON body)
- POST /import_file/ file import (multipart/form-data)
- POST /bulk-delete/ bulk delete
- POST /delete-all/ delete all
- GET /export/ export download
Smart filter syntax (filter parameter):
- name="word" fuzzy match on the name field
- name=="WordPress" exact match
- tag="cms" filter by tag
- focus="true" filter by the focus flag
"""
queryset = FingersFingerprint.objects.all()
serializer_class = FingersFingerprintSerializer
pagination_class = BasePagination
service_class = FingersFingerprintService
# Ordering configuration
ordering_fields = ['created_at', 'name']
ordering = ['-created_at']
# Fingers filter field mapping
FILTER_FIELD_MAPPING = {
'name': 'name',
'link': 'link',
'focus': 'focus',
}
# JSON array fields (queried with __contains)
JSON_ARRAY_FIELDS = ['tag', 'rule', 'default_port']
def parse_import_data(self, json_data) -> list:
"""
Parse import data in Fingers JSON format
Input format: [{...}, {...}] (an array)
Returns: list of fingerprints
"""
if isinstance(json_data, list):
return json_data
return []
def get_export_filename(self) -> str:
"""Export filename"""
return 'fingers_http.json'


@@ -31,6 +31,8 @@ from rest_framework.decorators import action
from rest_framework.request import Request
from rest_framework.response import Response
from apps.common.response_helpers import success_response, error_response
from apps.common.error_codes import ErrorCodes
from apps.engine.models import NucleiTemplateRepo
from apps.engine.serializers import NucleiTemplateRepoSerializer
from apps.engine.services import NucleiTemplateRepoService
@@ -107,18 +109,30 @@ class NucleiTemplateRepoViewSet(viewsets.ModelViewSet):
try:
repo_id = int(pk) if pk is not None else None
except (TypeError, ValueError):
return Response({"message": "无效的仓库 ID"}, status=status.HTTP_400_BAD_REQUEST)
return error_response(
code=ErrorCodes.VALIDATION_ERROR,
message='Invalid repository ID',
status_code=status.HTTP_400_BAD_REQUEST
)
# Call the Service layer
try:
result = self.service.refresh_repo(repo_id)
except ValidationError as exc:
return Response({"message": str(exc)}, status=status.HTTP_400_BAD_REQUEST)
return error_response(
code=ErrorCodes.VALIDATION_ERROR,
message=str(exc),
status_code=status.HTTP_400_BAD_REQUEST
)
except Exception as exc: # noqa: BLE001
logger.error("刷新 Nuclei 模板仓库失败: %s", exc, exc_info=True)
return Response({"message": f"刷新仓库失败: {exc}"}, status=status.HTTP_500_INTERNAL_SERVER_ERROR)
return error_response(
code=ErrorCodes.SERVER_ERROR,
message=f'Refresh failed: {exc}',
status_code=status.HTTP_500_INTERNAL_SERVER_ERROR
)
return Response({"message": "刷新成功", "result": result}, status=status.HTTP_200_OK)
return success_response(data={'result': result})
# ==================== Custom Action: read-only template browsing ====================
@@ -142,18 +156,30 @@ class NucleiTemplateRepoViewSet(viewsets.ModelViewSet):
try:
repo_id = int(pk) if pk is not None else None
except (TypeError, ValueError):
return Response({"message": "无效的仓库 ID"}, status=status.HTTP_400_BAD_REQUEST)
return error_response(
code=ErrorCodes.VALIDATION_ERROR,
message='Invalid repository ID',
status_code=status.HTTP_400_BAD_REQUEST
)
# Call the Service layer; the directory tree is read only from the current local directory
try:
roots = self.service.get_template_tree(repo_id)
except ValidationError as exc:
return Response({"message": str(exc)}, status=status.HTTP_400_BAD_REQUEST)
return error_response(
code=ErrorCodes.VALIDATION_ERROR,
message=str(exc),
status_code=status.HTTP_400_BAD_REQUEST
)
except Exception as exc: # noqa: BLE001
logger.error("获取 Nuclei 模板目录树失败: %s", exc, exc_info=True)
return Response({"message": "获取模板目录树失败"}, status=status.HTTP_500_INTERNAL_SERVER_ERROR)
return error_response(
code=ErrorCodes.SERVER_ERROR,
message='Failed to get template tree',
status_code=status.HTTP_500_INTERNAL_SERVER_ERROR
)
return Response({"roots": roots})
return success_response(data={'roots': roots})
@action(detail=True, methods=["get"], url_path="templates/content")
def templates_content(self, request: Request, pk: str | None = None) -> Response:
@@ -174,23 +200,43 @@ class NucleiTemplateRepoViewSet(viewsets.ModelViewSet):
try:
repo_id = int(pk) if pk is not None else None
except (TypeError, ValueError):
return Response({"message": "无效的仓库 ID"}, status=status.HTTP_400_BAD_REQUEST)
return error_response(
code=ErrorCodes.VALIDATION_ERROR,
message='Invalid repository ID',
status_code=status.HTTP_400_BAD_REQUEST
)
# Parse the path parameter
rel_path = (request.query_params.get("path", "") or "").strip()
if not rel_path:
return Response({"message": "缺少 path 参数"}, status=status.HTTP_400_BAD_REQUEST)
return error_response(
code=ErrorCodes.VALIDATION_ERROR,
message='Missing path parameter',
status_code=status.HTTP_400_BAD_REQUEST
)
# Call the Service layer
try:
result = self.service.get_template_content(repo_id, rel_path)
except ValidationError as exc:
return Response({"message": str(exc)}, status=status.HTTP_400_BAD_REQUEST)
return error_response(
code=ErrorCodes.VALIDATION_ERROR,
message=str(exc),
status_code=status.HTTP_400_BAD_REQUEST
)
except Exception as exc: # noqa: BLE001
logger.error("获取 Nuclei 模板内容失败: %s", exc, exc_info=True)
return Response({"message": "获取模板内容失败"}, status=status.HTTP_500_INTERNAL_SERVER_ERROR)
return error_response(
code=ErrorCodes.SERVER_ERROR,
message='Failed to get template content',
status_code=status.HTTP_500_INTERNAL_SERVER_ERROR
)
# File does not exist
if result is None:
return Response({"message": "模板不存在或无法读取"}, status=status.HTTP_404_NOT_FOUND)
return Response(result)
return error_response(
code=ErrorCodes.NOT_FOUND,
message='Template not found or unreadable',
status_code=status.HTTP_404_NOT_FOUND
)
return success_response(data=result)


@@ -9,6 +9,8 @@ from rest_framework.decorators import action
from rest_framework.response import Response
from apps.common.pagination import BasePagination
from apps.common.response_helpers import success_response, error_response
from apps.common.error_codes import ErrorCodes
from apps.engine.serializers.wordlist_serializer import WordlistSerializer
from apps.engine.services.wordlist_service import WordlistService
@@ -46,7 +48,11 @@ class WordlistViewSet(viewsets.ViewSet):
uploaded_file = request.FILES.get("file")
if not uploaded_file:
return Response({"error": "缺少字典文件"}, status=status.HTTP_400_BAD_REQUEST)
return error_response(
code=ErrorCodes.VALIDATION_ERROR,
message='Missing wordlist file',
status_code=status.HTTP_400_BAD_REQUEST
)
try:
wordlist = self.service.create_wordlist(
@@ -55,21 +61,32 @@ class WordlistViewSet(viewsets.ViewSet):
uploaded_file=uploaded_file,
)
except ValidationError as exc:
return Response({"error": str(exc)}, status=status.HTTP_400_BAD_REQUEST)
return error_response(
code=ErrorCodes.VALIDATION_ERROR,
message=str(exc),
status_code=status.HTTP_400_BAD_REQUEST
)
serializer = WordlistSerializer(wordlist)
return Response(serializer.data, status=status.HTTP_201_CREATED)
return success_response(data=serializer.data, status_code=status.HTTP_201_CREATED)
def destroy(self, request, pk=None):
"""Delete a wordlist record"""
try:
wordlist_id = int(pk)
except (TypeError, ValueError):
return Response({"error": "无效的 ID"}, status=status.HTTP_400_BAD_REQUEST)
return error_response(
code=ErrorCodes.VALIDATION_ERROR,
message='Invalid ID',
status_code=status.HTTP_400_BAD_REQUEST
)
success = self.service.delete_wordlist(wordlist_id)
if not success:
return Response({"error": "字典不存在"}, status=status.HTTP_404_NOT_FOUND)
return error_response(
code=ErrorCodes.NOT_FOUND,
status_code=status.HTTP_404_NOT_FOUND
)
return Response(status=status.HTTP_204_NO_CONTENT)
@@ -82,15 +99,27 @@ class WordlistViewSet(viewsets.ViewSet):
"""
name = (request.query_params.get("wordlist", "") or "").strip()
if not name:
return Response({"error": "缺少参数 wordlist"}, status=status.HTTP_400_BAD_REQUEST)
return error_response(
code=ErrorCodes.VALIDATION_ERROR,
message='Missing parameter: wordlist',
status_code=status.HTTP_400_BAD_REQUEST
)
wordlist = self.service.get_wordlist_by_name(name)
if not wordlist:
return Response({"error": "字典不存在"}, status=status.HTTP_404_NOT_FOUND)
return error_response(
code=ErrorCodes.NOT_FOUND,
message='Wordlist not found',
status_code=status.HTTP_404_NOT_FOUND
)
file_path = wordlist.file_path
if not file_path or not os.path.exists(file_path):
return Response({"error": "字典文件不存在"}, status=status.HTTP_404_NOT_FOUND)
return error_response(
code=ErrorCodes.NOT_FOUND,
message='Wordlist file not found',
status_code=status.HTTP_404_NOT_FOUND
)
filename = os.path.basename(file_path)
response = FileResponse(open(file_path, "rb"), as_attachment=True, filename=filename)
@@ -106,22 +135,38 @@ class WordlistViewSet(viewsets.ViewSet):
try:
wordlist_id = int(pk)
except (TypeError, ValueError):
return Response({"error": "无效的 ID"}, status=status.HTTP_400_BAD_REQUEST)
return error_response(
code=ErrorCodes.VALIDATION_ERROR,
message='Invalid ID',
status_code=status.HTTP_400_BAD_REQUEST
)
if request.method == "GET":
content = self.service.get_wordlist_content(wordlist_id)
if content is None:
return Response({"error": "字典不存在或文件无法读取"}, status=status.HTTP_404_NOT_FOUND)
return Response({"content": content})
return error_response(
code=ErrorCodes.NOT_FOUND,
message='Wordlist not found or file unreadable',
status_code=status.HTTP_404_NOT_FOUND
)
return success_response(data={"content": content})
elif request.method == "PUT":
content = request.data.get("content")
if content is None:
return Response({"error": "缺少 content 参数"}, status=status.HTTP_400_BAD_REQUEST)
return error_response(
code=ErrorCodes.VALIDATION_ERROR,
message='Missing content parameter',
status_code=status.HTTP_400_BAD_REQUEST
)
wordlist = self.service.update_wordlist_content(wordlist_id, content)
if not wordlist:
return Response({"error": "字典不存在或更新失败"}, status=status.HTTP_404_NOT_FOUND)
return error_response(
code=ErrorCodes.NOT_FOUND,
message='Wordlist not found or update failed',
status_code=status.HTTP_404_NOT_FOUND
)
serializer = WordlistSerializer(wordlist)
return Response(serializer.data)
return success_response(data=serializer.data)


@@ -9,6 +9,8 @@ from rest_framework import viewsets, status
from rest_framework.decorators import action
from rest_framework.response import Response
from apps.common.response_helpers import success_response, error_response
from apps.common.error_codes import ErrorCodes
from apps.engine.serializers import WorkerNodeSerializer
from apps.engine.services import WorkerService
from apps.common.signals import worker_delete_failed
@@ -111,9 +113,8 @@ class WorkerNodeViewSet(viewsets.ModelViewSet):
threading.Thread(target=_async_remote_uninstall, daemon=True).start()
# 3. Return success immediately
return Response(
{"message": f"节点 {worker_name} 已删除"},
status=status.HTTP_200_OK
return success_response(
data={'name': worker_name}
)
@action(detail=True, methods=['post'])
@@ -190,11 +191,13 @@ class WorkerNodeViewSet(viewsets.ModelViewSet):
worker.status = 'online'
worker.save(update_fields=['status'])
return Response({
'status': 'ok',
'need_update': need_update,
'server_version': server_version
})
return success_response(
data={
'status': 'ok',
'needUpdate': need_update,
'serverVersion': server_version
}
)
def _trigger_remote_agent_update(self, worker, target_version: str):
"""
@@ -304,9 +307,10 @@ class WorkerNodeViewSet(viewsets.ModelViewSet):
is_local = request.data.get('is_local', True)
if not name:
return Response(
{'error': '缺少 name 参数'},
status=status.HTTP_400_BAD_REQUEST
return error_response(
code=ErrorCodes.VALIDATION_ERROR,
message='Missing name parameter',
status_code=status.HTTP_400_BAD_REQUEST
)
worker, created = self.worker_service.register_worker(
@@ -314,11 +318,13 @@ class WorkerNodeViewSet(viewsets.ModelViewSet):
is_local=is_local
)
return Response({
'worker_id': worker.id,
'name': worker.name,
'created': created
})
return success_response(
data={
'workerId': worker.id,
'name': worker.name,
'created': created
}
)
@action(detail=False, methods=['get'])
def config(self, request):
@@ -334,13 +340,12 @@ class WorkerNodeViewSet(viewsets.ModelViewSet):
Returns:
{
"db": {"host": "...", "port": "...", ...},
"redisUrl": "...",
"paths": {"results": "...", "logs": "..."}
}
Configuration logic:
- Local Worker (is_local=true): db_host=postgres, redis=redis:6379
- Remote Worker (is_local=false): db_host=PUBLIC_HOST, redis=PUBLIC_HOST:6379
- Local Worker (is_local=true): db_host=postgres
- Remote Worker (is_local=false): db_host=PUBLIC_HOST
"""
from django.conf import settings
import logging
@@ -365,39 +370,35 @@ class WorkerNodeViewSet(viewsets.ModelViewSet):
if is_local_worker:
# Local Worker: use the Docker-internal service name directly
worker_db_host = 'postgres'
worker_redis_url = 'redis://redis:6379/0'
else:
# Remote Worker: connect through the public IP
public_host = settings.PUBLIC_HOST
if public_host in ('server', 'localhost', '127.0.0.1'):
logger.warning("远程 Worker 请求配置,但 PUBLIC_HOST=%s 不是有效的公网地址", public_host)
worker_db_host = public_host
worker_redis_url = f'redis://{public_host}:6379/0'
else:
# Remote database scenario: every Worker uses DB_HOST
worker_db_host = db_host
worker_redis_url = getattr(settings, 'WORKER_REDIS_URL', 'redis://redis:6379/0')
logger.info("返回 Worker 配置 - db_host: %s, redis_url: %s", worker_db_host, worker_redis_url)
logger.info("返回 Worker 配置 - db_host: %s", worker_db_host)
return Response({
'db': {
'host': worker_db_host,
'port': str(settings.DATABASES['default']['PORT']),
'name': settings.DATABASES['default']['NAME'],
'user': settings.DATABASES['default']['USER'],
'password': settings.DATABASES['default']['PASSWORD'],
},
'redisUrl': worker_redis_url,
'paths': {
'results': getattr(settings, 'CONTAINER_RESULTS_MOUNT', '/opt/xingrin/results'),
'logs': getattr(settings, 'CONTAINER_LOGS_MOUNT', '/opt/xingrin/logs'),
},
'logging': {
'level': os.getenv('LOG_LEVEL', 'INFO'),
'enableCommandLogging': os.getenv('ENABLE_COMMAND_LOGGING', 'true').lower() == 'true',
},
'debug': settings.DEBUG,
# Git acceleration settings (used to speed up Git clone, e.g. for the Nuclei template repository)
'gitMirror': settings.GIT_MIRROR,
})
return success_response(
data={
'db': {
'host': worker_db_host,
'port': str(settings.DATABASES['default']['PORT']),
'name': settings.DATABASES['default']['NAME'],
'user': settings.DATABASES['default']['USER'],
'password': settings.DATABASES['default']['PASSWORD'],
},
'paths': {
'results': getattr(settings, 'CONTAINER_RESULTS_MOUNT', '/opt/xingrin/results'),
'logs': getattr(settings, 'CONTAINER_LOGS_MOUNT', '/opt/xingrin/logs'),
},
'logging': {
'level': os.getenv('LOG_LEVEL', 'INFO'),
'enableCommandLogging': os.getenv('ENABLE_COMMAND_LOGGING', 'true').lower() == 'true',
},
'debug': settings.DEBUG,
}
)


@@ -7,7 +7,7 @@
from django.conf import settings
# ==================== Path configuration ====================
SCAN_TOOLS_BASE_PATH = getattr(settings, 'SCAN_TOOLS_BASE_PATH', '/opt/xingrin/tools')
SCAN_TOOLS_BASE_PATH = getattr(settings, 'SCAN_TOOLS_BASE_PATH', '/usr/local/bin')
# ==================== Subdomain discovery ====================
@@ -35,7 +35,7 @@ SUBDOMAIN_DISCOVERY_COMMANDS = {
},
'sublist3r': {
'base': "python3 '{scan_tools_base}/Sublist3r/sublist3r.py' -d {domain} -o '{output_file}'",
'base': "python3 '/usr/local/share/Sublist3r/sublist3r.py' -d {domain} -o '{output_file}'",
'optional': {
'threads': '-t {threads}'
}
@@ -97,9 +97,11 @@ SITE_SCAN_COMMANDS = {
'base': (
"'{scan_tools_base}/httpx' -l '{url_file}' "
'-status-code -content-type -content-length '
'-location -title -server -body-preview '
'-location -title -server '
'-tech-detect -cdn -vhost '
'-random-agent -no-color -json'
'-include-response '
'-rstr 2000 '
'-random-agent -no-color -json -silent'
),
'optional': {
'threads': '-threads {threads}',
@@ -115,7 +117,7 @@ SITE_SCAN_COMMANDS = {
DIRECTORY_SCAN_COMMANDS = {
'ffuf': {
'base': "ffuf -u '{url}FUZZ' -se -ac -sf -json -w '{wordlist}'",
'base': "'{scan_tools_base}/ffuf' -u '{url}FUZZ' -se -ac -sf -json -w '{wordlist}'",
'optional': {
'delay': '-p {delay}',
'threads': '-t {threads}',
@@ -169,9 +171,11 @@ URL_FETCH_COMMANDS = {
'base': (
"'{scan_tools_base}/httpx' -l '{url_file}' "
'-status-code -content-type -content-length '
'-location -title -server -body-preview '
'-location -title -server '
'-tech-detect -cdn -vhost '
'-random-agent -no-color -json'
'-include-response '
'-rstr 2000 '
'-random-agent -no-color -json -silent'
),
'optional': {
'threads': '-threads {threads}',
@@ -239,6 +243,9 @@ FINGERPRINT_DETECT_COMMANDS = {
'ehole': '--ehole {ehole}',
'goby': '--goby {goby}',
'wappalyzer': '--wappalyzer {wappalyzer}',
'fingers': '--fingers {fingers}',
'fingerprinthub': '--fingerprint {fingerprinthub}',
'arl': '--arl {arl}',
}
},
}
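
Each entry in these tables pairs a `base` template with `optional` fragments keyed by parameter name. The sketch below shows one way such a template could be expanded; `build_command` is a hypothetical helper written for illustration, not the project's actual builder, and the URL and wordlist path are made up.

```python
def build_command(template: dict, params: dict) -> str:
    """Expand 'base', then append each optional fragment whose parameter is set."""
    parts = [template['base'].format(**params)]
    for key, fragment in template.get('optional', {}).items():
        if params.get(key) is not None:
            parts.append(fragment.format(**params))
    return ' '.join(parts)


# Template copied from the ffuf entry above (two optional flags only).
FFUF = {
    'base': "'{scan_tools_base}/ffuf' -u '{url}FUZZ' -se -ac -sf -json -w '{wordlist}'",
    'optional': {
        'delay': '-p {delay}',
        'threads': '-t {threads}',
    },
}

cmd = build_command(FFUF, {
    'scan_tools_base': '/usr/local/bin',
    'url': 'https://example.com/',
    'wordlist': '/tmp/words.txt',
    'threads': 10,
})
print(cmd)
```

With `scan_tools_base` substituted into the ffuf template, the binary is resolved from the configured directory rather than from `PATH`, which appears to be the intent of the change to that entry.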


@@ -4,14 +4,12 @@
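# Required parameter: enabled (whether the tool is enabled)
# Optional parameter: timeout (timeout in seconds; default auto = computed automatically)
# ==================== Subdomain discovery ====================
#
# Stage 1: passive collection (parallel) - required, enable at least one tool
# Stage 2: dictionary brute force (optional) - enumerate subdomains with a wordlist
# Stage 3: permutation generation + verification (optional) - generate permutations of discovered domains and verify liveness as a stream
# Stage 4: DNS liveness check (optional) - verify that every candidate domain resolves
#
subdomain_discovery:
# ==================== Subdomain discovery ====================
# Stage 1: passive collection (parallel) - required, enable at least one tool
# Stage 2: dictionary brute force (optional) - enumerate subdomains with a wordlist
# Stage 3: permutation generation + verification (optional) - generate permutations of discovered domains and verify liveness as a stream
# Stage 4: DNS liveness check (optional) - verify that every candidate domain resolves
# === Stage 1: passive collection tools (run in parallel) ===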
passive_tools:
subfinder:
@@ -53,10 +51,10 @@ subdomain_discovery:
resolve:
enabled: true
subdomain_resolve:
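# timeout: auto # computed automatically from the number of candidate subdomains
timeout: auto # computed automatically from the number of candidate subdomains
# ==================== Port scan ====================
port_scan:
# ==================== Port scan ====================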
tools:
naabu_active:
enabled: true
@@ -70,8 +68,8 @@ port_scan:
enabled: true
# timeout: auto # passive scans are usually fast
# ==================== Site scan ====================
site_scan:
# ==================== Site scan ====================
tools:
httpx:
enabled: true
@@ -81,16 +79,16 @@ site_scan:
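# request-timeout: 10 # per-request timeout in seconds (default 10)
# retries: 2 # number of retries for failed requests
# ==================== Fingerprint detection ====================
# Runs serially after site_scan to identify the WebSite's technology stack
fingerprint_detect:
# ==================== Fingerprint detection ====================
# Runs serially after the site scan to identify the WebSite's technology stack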
tools:
xingfinger:
enabled: true
fingerprint-libs: [ehole, goby, wappalyzer] # enabled fingerprint libraries: ehole, goby, wappalyzer, fingers, fingerprinthub
fingerprint-libs: [ehole, goby, wappalyzer, fingers, fingerprinthub, arl] # all fingerprint libraries are enabled by default
# ==================== Directory scan ====================
directory_scan:
# ==================== Directory scan ====================
tools:
ffuf:
enabled: true
@@ -103,8 +101,8 @@ directory_scan:
match-codes: 200,201,301,302,401,403 # HTTP status codes to match
# rate: 0 # requests per second (default 0 = unlimited)
# ==================== URL fetch ====================
url_fetch:
# ==================== URL fetch ====================
tools:
waymore:
enabled: true
@@ -142,8 +140,8 @@ url_fetch:
# request-timeout: 10 # per-request timeout in seconds (default 10)
# retries: 2 # number of retries for failed requests
# ==================== Vulnerability scan ====================
vuln_scan:
# ==================== Vulnerability scan ====================
tools:
dalfox_xss:
enabled: true


@@ -37,28 +37,24 @@ logger = logging.getLogger(__name__)
def calculate_fingerprint_detect_timeout(
url_count: int,
base_per_url: float = 3.0,
min_timeout: int = 60
base_per_url: float = 10.0,
min_timeout: int = 300
) -> int:
"""
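Compute the timeout from the number of URLs
Formula: timeout = URL count × base time per URL
Minimum: 60 seconds
Minimum: 300 seconds
No upper bound
Args:
url_count: number of URLs
base_per_url: base time per URL in seconds, default 3
min_timeout: minimum timeout in seconds, default 60
base_per_url: base time per URL in seconds, default 10
min_timeout: minimum timeout in seconds, default 300
Returns:
int: the computed timeout in seconds
Examples:
100 URLs × 3 s = 300 s
1000 URLs × 3 s = 3000 s (50 minutes)
10000 URLs × 3 s = 30000 s (8.3 hours)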
"""
timeout = int(url_count * base_per_url)
return max(min_timeout, timeout)
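To see what the new defaults mean in practice, the function from this hunk can be run standalone. The sketch below copies its body as shown in the diff (new defaults of 10 seconds per URL and a 300-second floor); the sample URL counts are illustrative only.

```python
def calculate_fingerprint_detect_timeout(
    url_count: int,
    base_per_url: float = 10.0,
    min_timeout: int = 300,
) -> int:
    """Timeout = URL count x base time per URL, never below min_timeout."""
    timeout = int(url_count * base_per_url)
    return max(min_timeout, timeout)


# Illustrative URL counts (not taken from the repository)
for count in (0, 10, 100, 1000):
    print(count, calculate_fingerprint_detect_timeout(count))
```

With these defaults the floor applies to any scan of 30 URLs or fewer; above that the timeout grows linearly with no upper bound.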
@@ -260,7 +256,8 @@ def fingerprint_detect_flow(
'url_count': int,
'processed_records': int,
'updated_count': int,
'not_found_count': int,
'created_count': int,
'snapshot_count': int,
'executed_tasks': list,
'tool_stats': dict
}
@@ -307,6 +304,7 @@ def fingerprint_detect_flow(
'processed_records': 0,
'updated_count': 0,
'created_count': 0,
'snapshot_count': 0,
'executed_tasks': ['export_urls_for_fingerprint'],
'tool_stats': {
'total': 0,
@@ -344,6 +342,7 @@ def fingerprint_detect_flow(
total_processed = sum(stats['result'].get('processed_records', 0) for stats in tool_stats.values())
total_updated = sum(stats['result'].get('updated_count', 0) for stats in tool_stats.values())
total_created = sum(stats['result'].get('created_count', 0) for stats in tool_stats.values())
total_snapshots = sum(stats['result'].get('snapshot_count', 0) for stats in tool_stats.values())
successful_tools = [name for name in enabled_tools.keys()
if name not in [f['tool'] for f in failed_tools]]
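The per-tool totals above all follow the same pattern: sum one field of each tool's `result` dict, defaulting to 0 when the field is absent. A minimal sketch of that pattern follows; the helper name `sum_result_field` and the sample data are assumptions for illustration, and unlike the diff (which indexes `stats['result']` directly) the helper also tolerates a missing `result` key.

```python
def sum_result_field(tool_stats: dict, field: str) -> int:
    """Sum one counter across all tools, treating missing values as 0."""
    return sum(
        stats.get('result', {}).get(field, 0)
        for stats in tool_stats.values()
    )


# Hypothetical stats: the second tool reports no snapshot_count at all
tool_stats = {
    'xingfinger': {'result': {'processed_records': 120, 'snapshot_count': 120}},
    'other_tool': {'result': {'processed_records': 5}},
}

total_processed = sum_result_field(tool_stats, 'processed_records')
total_snapshots = sum_result_field(tool_stats, 'snapshot_count')
print(total_processed, total_snapshots)
```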
@@ -358,6 +357,7 @@ def fingerprint_detect_flow(
'processed_records': total_processed,
'updated_count': total_updated,
'created_count': total_created,
'snapshot_count': total_snapshots,
'executed_tasks': executed_tasks,
'tool_stats': {
'total': len(enabled_tools),


@@ -114,8 +114,11 @@ def initiate_scan_flow(
# ==================== Task 2: Load the engine configuration ====================
from apps.scan.models import Scan
scan = Scan.objects.select_related('engine').get(id=scan_id)
engine_config = scan.engine.configuration
scan = Scan.objects.get(id=scan_id)
engine_config = scan.merged_configuration
# Use engine_names for display
display_engine_name = ', '.join(scan.engine_names) if scan.engine_names else engine_name
# ==================== Task 3: Parse the configuration and build the execution plan ====================
orchestrator = FlowOrchestrator(engine_config)


@@ -20,11 +20,19 @@ class Scan(models.Model):
target = models.ForeignKey('targets.Target', on_delete=models.CASCADE, related_name='scans', help_text='扫描目标')
engine = models.ForeignKey(
'engine.ScanEngine',
on_delete=models.CASCADE,
related_name='scans',
help_text='使用的扫描引擎'
# Multi-engine support fields
engine_ids = ArrayField(
models.IntegerField(),
default=list,
help_text='引擎 ID 列表'
)
engine_names = models.JSONField(
default=list,
help_text='引擎名称列表,如 ["引擎A", "引擎B"]'
)
merged_configuration = models.TextField(
default='',
help_text='合并后的 YAML 配置'
)
created_at = models.DateTimeField(auto_now_add=True, help_text='任务创建时间')
@@ -118,12 +126,19 @@ class ScheduledScan(models.Model):
# Basic information
name = models.CharField(max_length=200, help_text='任务名称')
# Associated scan engine
engine = models.ForeignKey(
'engine.ScanEngine',
on_delete=models.CASCADE,
related_name='scheduled_scans',
help_text='使用的扫描引擎'
# Multi-engine support fields
engine_ids = ArrayField(
models.IntegerField(),
default=list,
help_text='引擎 ID 列表'
)
engine_names = models.JSONField(
default=list,
help_text='引擎名称列表,如 ["引擎A", "引擎B"]'
)
merged_configuration = models.TextField(
default='',
help_text='合并后的 YAML 配置'
)
# Associated organization (organization scan mode: all targets under the organization are resolved at run time)


@@ -5,12 +5,13 @@ WebSocket Consumer - 通知实时推送
import json
import logging
import asyncio
from channels.generic.websocket import AsyncWebsocketConsumer
from apps.common.websocket_auth import AuthenticatedWebsocketConsumer
logger = logging.getLogger(__name__)
class NotificationConsumer(AsyncWebsocketConsumer):
class NotificationConsumer(AuthenticatedWebsocketConsumer):
"""
Notification WebSocket consumer
@@ -23,9 +24,9 @@ class NotificationConsumer(AsyncWebsocketConsumer):
super().__init__(*args, **kwargs)
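self.heartbeat_task = None # heartbeat task
async def connect(self):
async def on_connect(self):
"""
Called when a client connects
Called when a client connects (already authenticated)
Joins the notification broadcast group
"""
# Notification group name (shared by all clients)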


@@ -305,6 +305,7 @@ def _push_via_api_callback(notification: Notification, server_url: str) -> None:
Sends an HTTP request to the Server container's /api/callbacks/notification/ endpoint.
Workers cannot reach Redis directly, so the Server pushes to WebSocket on their behalf.
"""
import os
import requests
try:
@@ -318,8 +319,14 @@ def _push_via_api_callback(notification: Notification, server_url: str) -> None:
'created_at': notification.created_at.isoformat()
}
# Build the request headers (including the Worker API key)
headers = {'Content-Type': 'application/json'}
worker_api_key = os.environ.get("WORKER_API_KEY", "")
if worker_api_key:
headers["X-Worker-API-Key"] = worker_api_key
# verify=False: the Server may present a self-signed certificate when a remote Worker calls back
resp = requests.post(callback_url, json=data, timeout=5, verify=False)
resp = requests.post(callback_url, json=data, headers=headers, timeout=5, verify=False)
resp.raise_for_status()
logger.debug(f"通知回调推送成功 - ID: {notification.id}")
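The header construction added in this hunk can be isolated into a small pure function, which makes the "send the key only when it is configured" behavior easy to check. This is a sketch only: the helper name `build_callback_headers` and its `environ` parameter are assumptions, while the `WORKER_API_KEY` variable and `X-Worker-API-Key` header come from the diff.

```python
import os
from typing import Mapping, Optional


def build_callback_headers(environ: Optional[Mapping[str, str]] = None) -> dict:
    """Build callback headers, adding X-Worker-API-Key only if a key is set."""
    env = os.environ if environ is None else environ
    headers = {'Content-Type': 'application/json'}
    worker_api_key = env.get('WORKER_API_KEY', '')
    if worker_api_key:
        headers['X-Worker-API-Key'] = worker_api_key
    return headers


# The mapping argument stands in for os.environ so the example is deterministic
print(build_callback_headers({'WORKER_API_KEY': 'example-key'}))
print(build_callback_headers({}))
```

Passing the environment in as a mapping keeps the function testable without mutating the real process environment.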


@@ -7,13 +7,14 @@ from typing import Any
from django.http import JsonResponse
from django.utils import timezone
from rest_framework import status
from rest_framework.decorators import api_view, permission_classes
from rest_framework.permissions import AllowAny
from rest_framework.decorators import api_view
from rest_framework.request import Request
from rest_framework.response import Response
from rest_framework.views import APIView
from apps.common.pagination import BasePagination
from apps.common.response_helpers import success_response, error_response
from apps.common.error_codes import ErrorCodes
from .models import Notification
from .serializers import NotificationSerializer
from .types import NotificationLevel
@@ -60,34 +61,7 @@ def notifications_test(request):
}, status=500)
def build_api_response(
data: Any = None,
*,
message: str = '操作成功',
code: str = '200',
state: str = 'success',
status_code: int = status.HTTP_200_OK
) -> Response:
"""Build a response in the unified API format
Args:
data: response payload (optional)
message: response message
code: response code
state: response state (success/error)
status_code: HTTP status code
Returns:
DRF Response object
"""
payload = {
'code': code,
'state': state,
'message': message,
}
if data is not None:
payload['data'] = data
return Response(payload, status=status_code)
# build_api_response is deprecated; use success_response/error_response instead
def _parse_bool(value: str | None) -> bool | None:
@@ -172,7 +146,7 @@ class NotificationUnreadCountView(APIView):
"""Get the number of unread notifications"""
service = NotificationService()
count = service.get_unread_count()
return build_api_response({'count': count}, message='获取未读数量成功')
return success_response(data={'count': count})
class NotificationMarkAllAsReadView(APIView):
@@ -192,7 +166,7 @@ class NotificationMarkAllAsReadView(APIView):
"""Mark all notifications as read"""
service = NotificationService()
updated = service.mark_all_as_read()
return build_api_response({'updated': updated}, message='全部标记已读成功')
return success_response(data={'updated': updated})
class NotificationSettingsView(APIView):
@@ -209,13 +183,13 @@ class NotificationSettingsView(APIView):
"""Get the notification settings"""
service = NotificationSettingsService()
settings = service.get_settings()
return Response(settings)
return success_response(data=settings)
def put(self, request: Request) -> Response:
"""Update the notification settings"""
service = NotificationSettingsService()
settings = service.update_settings(request.data)
return Response({'message': '已保存通知设置', **settings})
return success_response(data=settings)
# ============================================
@@ -223,12 +197,13 @@ class NotificationSettingsView(APIView):
# ============================================
@api_view(['POST'])
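@permission_classes([AllowAny]) # Worker containers are unauthenticated; consider adding token verification
# Permissions are handled by the global IsAuthenticatedOrPublic; /api/callbacks/* requires Worker API key authentication
def notification_callback(request):
"""
Receives notification push requests from Workers
Worker containers cannot reach Redis directly; they call back through this API so the Server pushes to WebSocket.
Requires Worker API key authentication (X-Worker-API-Key header)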
POST /api/callbacks/notification/
{
@@ -247,22 +222,24 @@ def notification_callback(request):
required_fields = ['id', 'category', 'title', 'message', 'level', 'created_at']
for field in required_fields:
if field not in data:
return Response(
{'error': f'缺少字段: {field}'},
status=status.HTTP_400_BAD_REQUEST
return error_response(
code=ErrorCodes.VALIDATION_ERROR,
message=f'Missing field: {field}',
status_code=status.HTTP_400_BAD_REQUEST
)
# Push to WebSocket
_push_notification_to_websocket(data)
logger.debug(f"回调通知推送成功 - ID: {data['id']}, Title: {data['title']}")
return Response({'status': 'ok'})
return success_response(data={'status': 'ok'})
except Exception as e:
logger.error(f"回调通知处理失败: {e}", exc_info=True)
return Response(
{'error': str(e)},
status=status.HTTP_500_INTERNAL_SERVER_ERROR
return error_response(
code=ErrorCodes.SERVER_ERROR,
message=str(e),
status_code=status.HTTP_500_INTERNAL_SERVER_ERROR
)
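The required-field check in `notification_callback` reports the first missing key. A framework-free sketch of that check is shown below; the helper name `first_missing_field` is an assumption, and the field list is copied from the diff.

```python
from typing import Optional

REQUIRED_FIELDS = ('id', 'category', 'title', 'message', 'level', 'created_at')


def first_missing_field(data: dict) -> Optional[str]:
    """Return the first required field absent from data, or None if complete."""
    for field in REQUIRED_FIELDS:
        if field not in data:
            return field
    return None


# Hypothetical payload that omits 'level' and 'created_at'
payload = {'id': 1, 'category': 'scan', 'title': 'Done', 'message': 'Scan finished'}
print(first_missing_field(payload))
```

In the view, a non-None result maps to the `error_response(code=ErrorCodes.VALIDATION_ERROR, ...)` branch with HTTP 400.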


@@ -16,7 +16,6 @@ from django.utils import timezone
from apps.scan.models import Scan
from apps.targets.models import Target
from apps.engine.models import ScanEngine
from apps.common.definitions import ScanStatus
from apps.common.decorators import auto_ensure_db_connection
@@ -40,7 +39,7 @@ class DjangoScanRepository:
Args:
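scan_id: scan task ID
prefetch_relations: whether to preload related objects (engine, target)
prefetch_relations: whether to preload related objects (target, worker)
Defaults to False; set to True only when related data must be displayed
for_update: whether to lock the row (for update scenarios)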
@@ -56,7 +55,7 @@ class DjangoScanRepository:
# Preload related objects (performance: not loaded by default)
if prefetch_relations:
queryset = queryset.select_related('engine', 'target')
queryset = queryset.select_related('target', 'worker')
return queryset.get(id=scan_id)
except Scan.DoesNotExist: # type: ignore # pylint: disable=no-member
@@ -79,7 +78,7 @@ class DjangoScanRepository:
Note:
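- Uses the default blocking mode (waits for the lock to be released)
- Does not load related objects (engine, target); use get_by_id() if they are needed
- Does not load related objects (target, worker); use get_by_id() if they are needed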
"""
try:
return Scan.objects.select_for_update().get(id=scan_id) # type: ignore # pylint: disable=no-member
@@ -103,7 +102,9 @@ class DjangoScanRepository:
def create(self,
target: Target,
engine: ScanEngine,
engine_ids: List[int],
engine_names: List[str],
merged_configuration: str,
results_dir: str,
status: ScanStatus = ScanStatus.INITIATED
) -> Scan:
@@ -112,7 +113,9 @@ class DjangoScanRepository:
Args:
target: scan target
engine: scan engine
engine_ids: list of engine IDs
engine_names: list of engine names
merged_configuration: merged YAML configuration
results_dir: results directory
status: initial status
@@ -121,7 +124,9 @@ class DjangoScanRepository:
"""
scan = Scan(
target=target,
engine=engine,
engine_ids=engine_ids,
engine_names=engine_names,
merged_configuration=merged_configuration,
results_dir=results_dir,
status=status,
container_ids=[]
@@ -231,14 +236,14 @@ class DjangoScanRepository:
Get all scan tasks
Args:
prefetch_relations: whether to preload related objects (engine, target)
prefetch_relations: whether to preload related objects (target, worker)
Returns:
Scan QuerySet
"""
queryset = Scan.objects.all() # type: ignore # pylint: disable=no-member
if prefetch_relations:
queryset = queryset.select_related('engine', 'target')
queryset = queryset.select_related('target', 'worker')
return queryset.order_by('-created_at')


@@ -29,7 +29,9 @@ class ScheduledScanDTO:
"""
id: Optional[int] = None
name: str = ''
engine_id: int = 0
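engine_ids: Optional[List[int]] = None # multi-engine support; normalized to [] in __post_init__
engine_names: Optional[List[str]] = None # engine names; normalized to [] in __post_init__
merged_configuration: str = '' # merged configuration
organization_id: Optional[int] = None # organization scan mode
target_id: Optional[int] = None # target scan mode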
cron_expression: Optional[str] = None
@@ -40,6 +42,11 @@ class ScheduledScanDTO:
created_at: Optional[datetime] = None
updated_at: Optional[datetime] = None
def __post_init__(self):
if self.engine_ids is None:
self.engine_ids = []
if self.engine_names is None:
self.engine_names = []
@auto_ensure_db_connection
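The `None` default plus `__post_init__` normalization above avoids sharing one mutable list between instances. An alternative idiom is `dataclasses.field(default_factory=list)`. The sketch below is a trimmed, hypothetical stand-in (`ScheduledScanDTOSketch`), not the repository's DTO, and it assumes callers never need to tell "not provided" apart from "empty list".

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ScheduledScanDTOSketch:
    """Hypothetical, reduced DTO showing per-instance list defaults."""
    name: str = ''
    engine_ids: List[int] = field(default_factory=list)
    engine_names: List[str] = field(default_factory=list)
    merged_configuration: str = ''


first = ScheduledScanDTOSketch(name='nightly')
second = ScheduledScanDTOSketch(name='weekly')
first.engine_ids.append(1)  # must not leak into `second`
print(first.engine_ids, second.engine_ids)
```

Either way, the field is a list and never `None` after construction, so code that wants to detect "engine list not supplied" has to test for emptiness.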
@@ -56,7 +63,7 @@ class DjangoScheduledScanRepository:
def get_by_id(self, scheduled_scan_id: int) -> Optional[ScheduledScan]:
"""Look up a scheduled scan task by ID"""
try:
return ScheduledScan.objects.select_related('engine', 'organization', 'target').get(id=scheduled_scan_id)
return ScheduledScan.objects.select_related('organization', 'target').get(id=scheduled_scan_id)
except ScheduledScan.DoesNotExist:
return None
@@ -67,7 +74,7 @@ class DjangoScheduledScanRepository:
Returns:
QuerySet
"""
return ScheduledScan.objects.select_related('engine', 'organization', 'target').order_by('-created_at')
return ScheduledScan.objects.select_related('organization', 'target').order_by('-created_at')
def get_all(self, page: int = 1, page_size: int = 10) -> Tuple[List[ScheduledScan], int]:
"""
@@ -87,7 +94,7 @@ class DjangoScheduledScanRepository:
def get_enabled(self) -> List[ScheduledScan]:
"""Get all enabled scheduled scan tasks"""
return list(
ScheduledScan.objects.select_related('engine', 'target')
ScheduledScan.objects.select_related('target')
.filter(is_enabled=True)
.order_by('-created_at')
)
@@ -105,7 +112,9 @@ class DjangoScheduledScanRepository:
with transaction.atomic():
scheduled_scan = ScheduledScan.objects.create(
name=dto.name,
engine_id=dto.engine_id,
engine_ids=dto.engine_ids,
engine_names=dto.engine_names,
merged_configuration=dto.merged_configuration,
organization_id=dto.organization_id, # organization scan mode
target_id=dto.target_id if not dto.organization_id else None, # target scan mode
cron_expression=dto.cron_expression,
@@ -134,8 +143,12 @@ class DjangoScheduledScanRepository:
# Update basic fields
if dto.name:
scheduled_scan.name = dto.name
if dto.engine_id:
scheduled_scan.engine_id = dto.engine_id
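# Overwrite engine fields only when they were supplied: __post_init__ turns a missing
# engine_ids/engine_names into [], so an "is not None" check would always pass
if dto.engine_ids:
scheduled_scan.engine_ids = dto.engine_ids
if dto.engine_names:
scheduled_scan.engine_names = dto.engine_names
if dto.merged_configuration:
scheduled_scan.merged_configuration = dto.merged_configuration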
if dto.cron_expression is not None:
scheduled_scan.cron_expression = dto.cron_expression
if dto.is_enabled is not None:


@@ -7,12 +7,11 @@ from .models import Scan, ScheduledScan
class ScanSerializer(serializers.ModelSerializer):
"""Scan task serializer"""
target_name = serializers.SerializerMethodField()
engine_name = serializers.SerializerMethodField()
class Meta:
model = Scan
fields = [
'id', 'target', 'target_name', 'engine', 'engine_name',
'id', 'target', 'target_name', 'engine_ids', 'engine_names',
'created_at', 'stopped_at', 'status', 'results_dir',
'container_ids', 'error_message'
]
@@ -24,10 +23,6 @@ class ScanSerializer(serializers.ModelSerializer):
def get_target_name(self, obj):
"""Get the target name"""
return obj.target.name if obj.target else None
def get_engine_name(self, obj):
"""Get the engine name"""
return obj.engine.name if obj.engine else None
class ScanHistorySerializer(serializers.ModelSerializer):
@@ -36,11 +31,12 @@ class ScanHistorySerializer(serializers.ModelSerializer):
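Provides an optimized data format for the frontend scan history page, including:
- Scan summary statistics (subdomain, endpoint, and vulnerability counts)
- Progress percentage and current stage
- Execution node information
"""
# Field mappings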
target_name = serializers.CharField(source='target.name', read_only=True)
engine_name = serializers.CharField(source='engine.name', read_only=True)
worker_name = serializers.CharField(source='worker.name', read_only=True, allow_null=True)
# Computed fields
summary = serializers.SerializerMethodField()
@@ -53,9 +49,9 @@ class ScanHistorySerializer(serializers.ModelSerializer):
class Meta:
model = Scan
fields = [
'id', 'target', 'target_name', 'engine', 'engine_name',
'created_at', 'status', 'error_message', 'summary', 'progress',
'current_stage', 'stage_progress'
'id', 'target', 'target_name', 'engine_ids', 'engine_names',
'worker_name', 'created_at', 'status', 'error_message', 'summary',
'progress', 'current_stage', 'stage_progress'
]
def get_summary(self, obj):
@@ -105,10 +101,11 @@ class QuickScanSerializer(serializers.Serializer):
help_text='目标列表,每个目标包含 name 字段'
)
# Scan engine ID
engine_id = serializers.IntegerField(
# Scan engine ID list
engine_ids = serializers.ListField(
child=serializers.IntegerField(),
required=True,
help_text='使用的扫描引擎 ID (必填)'
help_text='使用的扫描引擎 ID 列表 (必填)'
)
def validate_targets(self, value):
@@ -130,6 +127,12 @@ class QuickScanSerializer(serializers.Serializer):
raise serializers.ValidationError(f"{idx + 1} 个目标的 name 不能为空")
return value
def validate_engine_ids(self, value):
"""Validate the engine ID list"""
if not value:
raise serializers.ValidationError("engine_ids 不能为空")
return value
# ==================== Scheduled scan serializers ====================
@@ -138,7 +141,6 @@ class ScheduledScanSerializer(serializers.ModelSerializer):
"""Scheduled scan task serializer (for list and detail views)"""
# Related fields
engine_name = serializers.CharField(source='engine.name', read_only=True)
organization_id = serializers.IntegerField(source='organization.id', read_only=True, allow_null=True)
organization_name = serializers.CharField(source='organization.name', read_only=True, allow_null=True)
target_id = serializers.IntegerField(source='target.id', read_only=True, allow_null=True)
@@ -149,7 +151,7 @@ class ScheduledScanSerializer(serializers.ModelSerializer):
model = ScheduledScan
fields = [
'id', 'name',
'engine', 'engine_name',
'engine_ids', 'engine_names',
'organization_id', 'organization_name',
'target_id', 'target_name',
'scan_mode',
@@ -178,7 +180,10 @@ class CreateScheduledScanSerializer(serializers.Serializer):
"""
name = serializers.CharField(max_length=200, help_text='任务名称')
engine_id = serializers.IntegerField(help_text='扫描引擎 ID')
engine_ids = serializers.ListField(
child=serializers.IntegerField(),
help_text='扫描引擎 ID 列表'
)
# Organization scan mode
organization_id = serializers.IntegerField(
@@ -201,6 +206,12 @@ class CreateScheduledScanSerializer(serializers.Serializer):
)
is_enabled = serializers.BooleanField(default=True, help_text='是否立即启用')
def validate_engine_ids(self, value):
"""Validate the engine ID list"""
if not value:
raise serializers.ValidationError("engine_ids 不能为空")
return value
def validate(self, data):
"""Validate that organization_id and target_id are mutually exclusive"""
organization_id = data.get('organization_id')
@@ -219,7 +230,11 @@ class UpdateScheduledScanSerializer(serializers.Serializer):
"""Serializer for updating a scheduled scan task"""
name = serializers.CharField(max_length=200, required=False, help_text='任务名称')
engine_id = serializers.IntegerField(required=False, help_text='扫描引擎 ID')
engine_ids = serializers.ListField(
child=serializers.IntegerField(),
required=False,
help_text='扫描引擎 ID 列表'
)
# Organization scan mode
organization_id = serializers.IntegerField(
@@ -237,6 +252,12 @@ class UpdateScheduledScanSerializer(serializers.Serializer):
cron_expression = serializers.CharField(max_length=100, required=False, help_text='Cron 表达式')
is_enabled = serializers.BooleanField(required=False, help_text='是否启用')
def validate_engine_ids(self, value):
"""Validate the engine ID list"""
if value is not None and not value:
raise serializers.ValidationError("engine_ids 不能为空")
return value
class ToggleScheduledScanSerializer(serializers.Serializer):


@@ -27,9 +27,6 @@ class BlacklistService:
DEFAULT_PATTERNS = [
r'\.gov$', # ends with .gov
r'\.gov\.[a-z]{2}$', # .gov.cn, .gov.uk, etc.
r'\.edu$', # ends with .edu
r'\.edu\.[a-z]{2}$', # .edu.cn, etc.
r'\.mil$', # ends with .mil
]
def __init__(self, patterns: Optional[List[str]] = None):


@@ -10,7 +10,7 @@
import uuid
import logging
import threading
from typing import List
from typing import List, Tuple
from datetime import datetime
from pathlib import Path
from django.conf import settings
@@ -20,6 +20,7 @@ from django.core.exceptions import ValidationError, ObjectDoesNotExist
from apps.scan.models import Scan
from apps.scan.repositories import DjangoScanRepository
from apps.scan.utils.config_merger import merge_engine_configs, ConfigConflictError
from apps.targets.repositories import DjangoTargetRepository, DjangoOrganizationRepository
from apps.engine.repositories import DjangoEngineRepository
from apps.targets.models import Target
@@ -142,6 +143,106 @@ class ScanCreationService:
return targets, engine
def prepare_initiate_scan_multi_engine(
self,
organization_id: int | None = None,
target_id: int | None = None,
engine_ids: List[int] | None = None
) -> Tuple[List[Target], str, List[str], List[int]]:
"""
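Prepare the data needed for a multi-engine scan
Responsibilities:
1. Validate parameters (required fields, mutually exclusive parameters)
2. Look up resources (Engines, Organization, Target)
3. Merge engine configurations (detecting conflicts)
4. Return the prepared target list, merged configuration, and engine info
Args:
organization_id: organization ID (optional)
target_id: target ID (optional)
engine_ids: list of scan engine IDs (required)
Returns:
(targets, merged configuration, engine names, engine IDs), for use by create_scans
Raises:
ValidationError: parameter validation failed or a business rule was violated
ObjectDoesNotExist: a resource does not exist (Organization/Target/ScanEngine)
ConfigConflictError: the engine configurations conflict
Note:
- Exactly one of organization_id and target_id must be provided
- If organization_id is provided, all targets under that organization are returned
- If target_id is provided, a single-element target list is returned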
"""
# 1. Validate parameters
if not engine_ids:
raise ValidationError('缺少必填参数: engine_ids')
if not organization_id and not target_id:
raise ValidationError('必须提供 organization_id 或 target_id 其中之一')
if organization_id and target_id:
raise ValidationError('organization_id 和 target_id 只能提供其中之一')
# 2. Look up all scan engines
engines = []
for engine_id in engine_ids:
engine = self.engine_repo.get_by_id(engine_id)
if not engine:
logger.error("扫描引擎不存在 - Engine ID: %s", engine_id)
raise ObjectDoesNotExist(f'ScanEngine ID {engine_id} 不存在')
engines.append(engine)
# 3. Merge engine configurations (may raise ConfigConflictError)
engine_configs = [(e.name, e.configuration or '') for e in engines]
merged_configuration = merge_engine_configs(engine_configs)
engine_names = [e.name for e in engines]
logger.debug(
"引擎配置合并成功 - 引擎: %s",
', '.join(engine_names)
)
# 4. Resolve the target list from the parameters
targets = []
if organization_id:
# Get all targets for the organization ID
organization = self.organization_repo.get_by_id(organization_id)
if not organization:
logger.error("组织不存在 - Organization ID: %s", organization_id)
raise ObjectDoesNotExist(f'Organization ID {organization_id} 不存在')
targets = self.organization_repo.get_targets(organization_id)
if not targets:
raise ValidationError(f'组织 ID {organization_id} 下没有目标')
logger.debug(
"准备发起扫描 - 组织: %s, 目标数量: %d, 引擎: %s",
organization.name,
len(targets),
', '.join(engine_names)
)
else:
# Get a single target by target ID
target = self.target_repo.get_by_id(target_id)
if not target:
logger.error("目标不存在 - Target ID: %s", target_id)
raise ObjectDoesNotExist(f'Target ID {target_id} 不存在')
targets = [target]
logger.debug(
"准备发起扫描 - 目标: %s, 引擎: %s",
target.name,
', '.join(engine_names)
)
return targets, merged_configuration, engine_names, engine_ids
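The diff calls `merge_engine_configs` with a list of `(engine_name, yaml_text)` pairs and expects a `ConfigConflictError` on conflict, but the merger itself is not shown in this section. The sketch below is therefore an assumption about one plausible rule, not the repository's implementation: it works on already-parsed dicts (no YAML parsing) and treats two engines defining the same top-level stage key as a conflict. The names `merge_parsed_configs` and `ConfigConflictSketchError` are hypothetical.

```python
from typing import Dict, List, Tuple


class ConfigConflictSketchError(Exception):
    """Hypothetical stand-in for the repository's ConfigConflictError."""


def merge_parsed_configs(configs: List[Tuple[str, dict]]) -> dict:
    """Merge per-engine configs; reject a stage key defined by two engines."""
    merged: dict = {}
    owner: Dict[str, str] = {}  # stage key -> engine that defined it
    for engine_name, config in configs:
        for key, value in config.items():
            if key in owner:
                raise ConfigConflictSketchError(
                    f"'{key}' is defined by both '{owner[key]}' and '{engine_name}'"
                )
            owner[key] = engine_name
            merged[key] = value
    return merged


# Illustrative engine configs keyed by stage names seen in the YAML above
merged = merge_parsed_configs([
    ('engine-a', {'port_scan': {'tools': {'naabu_active': {'enabled': True}}}}),
    ('engine-b', {'site_scan': {'tools': {'httpx': {'enabled': True}}}}),
])
print(sorted(merged))
```

Tracking the owning engine per key lets the error message name both sides of a conflict, which is the information a user needs to fix their engine selection.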
def _generate_scan_workspace_dir(self) -> str:
"""
Generate the workspace directory path for a Scan
@@ -179,7 +280,9 @@ class ScanCreationService:
def create_scans(
self,
targets: List[Target],
engine: ScanEngine,
engine_ids: List[int],
engine_names: List[str],
merged_configuration: str,
scheduled_scan_name: str | None = None
) -> List[Scan]:
"""
@@ -187,7 +290,9 @@ class ScanCreationService:
Args:
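targets: list of targets
engine: scan engine object
engine_ids: list of engine IDs
engine_names: list of engine names
merged_configuration: merged YAML configuration
scheduled_scan_name: scheduled scan task name (optional, shown in notifications)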
Returns:
@@ -205,7 +310,9 @@ class ScanCreationService:
scan_workspace_dir = self._generate_scan_workspace_dir()
scan = Scan(
target=target,
engine=engine,
engine_ids=engine_ids,
engine_names=engine_names,
merged_configuration=merged_configuration,
results_dir=scan_workspace_dir,
status=ScanStatus.INITIATED,
container_ids=[],
@@ -236,13 +343,15 @@ class ScanCreationService:
return []
# Step 3: dispatch tasks to Workers
# Join all engine names into a single display name
display_engine_name = ', '.join(engine_names) if engine_names else ''
scan_data = [
{
'scan_id': scan.id,
'target_name': scan.target.name,
'target_id': scan.target.id,
'results_dir': scan.results_dir,
'engine_name': scan.engine.name,
'engine_name': display_engine_name,
'scheduled_scan_name': scheduled_scan_name,
}
for scan in created_scans


@@ -96,14 +96,34 @@ class ScanService:
organization_id, target_id, engine_id
)
def prepare_initiate_scan_multi_engine(
self,
organization_id: int | None = None,
target_id: int | None = None,
engine_ids: List[int] | None = None
) -> tuple[List[Target], str, List[str], List[int]]:
"""
Prepare for creating multi-engine scan tasks
Returns:
(targets, merged configuration, engine names, engine IDs)
"""
return self.creation_service.prepare_initiate_scan_multi_engine(
organization_id, target_id, engine_ids
)
def create_scans(
self,
targets: List[Target],
engine: ScanEngine,
engine_ids: List[int],
engine_names: List[str],
merged_configuration: str,
scheduled_scan_name: str | None = None
) -> List[Scan]:
"""Create scan tasks in bulk (delegates to ScanCreationService)"""
return self.creation_service.create_scans(targets, engine, scheduled_scan_name)
return self.creation_service.create_scans(
targets, engine_ids, engine_names, merged_configuration, scheduled_scan_name
)
# ==================== State management methods (delegated to ScanStateService) ====================


@@ -14,6 +14,7 @@ from django.core.exceptions import ValidationError
from apps.scan.models import ScheduledScan
from apps.scan.repositories import DjangoScheduledScanRepository, ScheduledScanDTO
from apps.scan.utils.config_merger import merge_engine_configs, ConfigConflictError
from apps.engine.repositories import DjangoEngineRepository
from apps.targets.services import TargetService
@@ -57,8 +58,9 @@ class ScheduledScanService:
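Flow:
1. Validate parameters
2. Create the database record
3. Compute and set next_run_time
2. Merge engine configurations
3. Create the database record
4. Compute and set next_run_time
Args:
dto: scheduled scan DTO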
@@ -68,14 +70,30 @@ class ScheduledScanService:
Raises:
ValidationError: parameter validation failed
ConfigConflictError: engine configurations conflict
"""
# 1. Validate parameters
self._validate_create_dto(dto)
# 2. Create the database record
# 2. Merge engine configurations
engines = []
engine_names = []
for engine_id in dto.engine_ids:
engine = self.engine_repo.get_by_id(engine_id)
if engine:
engines.append((engine.name, engine.configuration or ''))
engine_names.append(engine.name)
merged_configuration = merge_engine_configs(engines)
# Store the merged configuration and engine names on the DTO
dto.engine_names = engine_names
dto.merged_configuration = merged_configuration
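# 3. Create the database record
scheduled_scan = self.repo.create(dto)
# 3. If a cron expression is set and the task is enabled, compute the next run time
# 4. If a cron expression is set and the task is enabled, compute the next run time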
if scheduled_scan.cron_expression and scheduled_scan.is_enabled:
next_run_time = self._calculate_next_run_time(scheduled_scan)
if next_run_time:
@@ -96,11 +114,13 @@ class ScheduledScanService:
if not dto.name:
raise ValidationError('任务名称不能为空')
if not dto.engine_id:
if not dto.engine_ids:
raise ValidationError('必须选择扫描引擎')
if not self.engine_repo.get_by_id(dto.engine_id):
raise ValidationError(f'扫描引擎 ID {dto.engine_id} 不存在')
# Verify that every engine exists
for engine_id in dto.engine_ids:
if not self.engine_repo.get_by_id(engine_id):
raise ValidationError(f'扫描引擎 ID {engine_id} 不存在')
# Validate the scan mode (organization_id and target_id are mutually exclusive)
if not dto.organization_id and not dto.target_id:
@@ -138,11 +158,28 @@ class ScheduledScanService:
Returns:
The updated ScheduledScan object
Raises:
ConfigConflictError: engine configurations conflict
"""
existing = self.repo.get_by_id(scheduled_scan_id)
if not existing:
return None
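# Re-merge the configuration only when new engines were supplied
# (__post_init__ maps a missing engine_ids to [], so test truthiness rather than "is not None")
if dto.engine_ids: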
engines = []
engine_names = []
for engine_id in dto.engine_ids:
engine = self.engine_repo.get_by_id(engine_id)
if engine:
engines.append((engine.name, engine.configuration or ''))
engine_names.append(engine.name)
merged_configuration = merge_engine_configs(engines)
dto.engine_names = engine_names
dto.merged_configuration = merged_configuration
# Update the database record
scheduled_scan = self.repo.update(scheduled_scan_id, dto)
if not scheduled_scan:
@@ -292,21 +329,25 @@ class ScheduledScanService:
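Trigger a scan immediately (supports both organization and target scan modes)
Reuses ScanService logic so behavior matches the API path.
Uses the stored merged_configuration instead of re-merging.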
"""
from apps.scan.services.scan_service import ScanService
scan_service = ScanService()
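# 1. Prepare the data needed for the scan (reusing the API logic)
targets, engine = scan_service.prepare_initiate_scan(
# 1. Prepare the data needed for the scan (using the stored multi-engine configuration)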
targets, _, _, _ = scan_service.prepare_initiate_scan_multi_engine(
organization_id=scheduled_scan.organization_id,
target_id=scheduled_scan.target_id,
engine_id=scheduled_scan.engine_id
engine_ids=scheduled_scan.engine_ids
)
# 2. Create the scan tasks, passing the scheduled scan name for notifications
# 2. Create the scan tasks using the stored merged configuration
created_scans = scan_service.create_scans(
targets, engine,
targets=targets,
engine_ids=scheduled_scan.engine_ids,
engine_names=scheduled_scan.engine_names,
merged_configuration=scheduled_scan.merged_configuration,
scheduled_scan_name=scheduled_scan.name
)


@@ -4,7 +4,6 @@ xingfinger 执行任务
Streams xingfinger command execution and updates the tech field in real time
"""
import importlib
import json
import logging
import subprocess
@@ -15,93 +14,97 @@ from django.db import connection
from prefect import task
from apps.scan.utils import execute_stream
from apps.asset.dtos.snapshot import WebsiteSnapshotDTO
from apps.asset.repositories.snapshot import DjangoWebsiteSnapshotRepository
logger = logging.getLogger(__name__)
# Data source mapping: source → (module_path, model_name, url_field)
SOURCE_MODEL_MAP = {
'website': ('apps.asset.models', 'WebSite', 'url'),
# Future extensions:
# 'endpoint': ('apps.asset.models', 'Endpoint', 'url'),
# 'directory': ('apps.asset.models', 'Directory', 'url'),
}
def _get_model_class(source: str):
"""Get the Model class for a data source type"""
if source not in SOURCE_MODEL_MAP:
raise ValueError(f"不支持的数据源: {source}")
module_path, model_name, _ = SOURCE_MODEL_MAP[source]
module = importlib.import_module(module_path)
return getattr(module, model_name)
def parse_xingfinger_line(line: str) -> tuple[str, list[str]] | None:
def parse_xingfinger_line(line: str) -> dict | None:
"""
Parse a single line of xingfinger JSON output
xingfinger silent-mode output format:
{"url": "https://example.com", "cms": "WordPress,PHP,nginx", ...}
xingfinger output format:
{"url": "...", "cms": "...", "server": "BWS/1.1", "status_code": 200, "length": 642831, "title": "..."}
Returns:
tuple: (url, tech_list), or None (on parse failure)
dict: contains url, techs, server, title, status_code, content_length
None: on parse failure or when the URL is empty
"""
try:
item = json.loads(line)
url = item.get('url', '').strip()
cms = item.get('cms', '')
if not url or not cms:
if not url:
return None
# Split the cms field on commas and strip whitespace
techs = [t.strip() for t in cms.split(',') if t.strip()]
cms = item.get('cms', '')
techs = [t.strip() for t in cms.split(',') if t.strip()] if cms else []
return (url, techs) if techs else None
return {
'url': url,
'techs': techs,
'server': item.get('server', ''),
'title': item.get('title', ''),
'status_code': item.get('status_code'),
'content_length': item.get('length'),
}
except json.JSONDecodeError:
return None
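The new `parse_xingfinger_line` keeps every record that has a URL, even when `cms` is empty, and carries `server`, `title`, `status_code`, and `length` along. The standalone sketch below mirrors the new-side logic from this hunk; the `isinstance` guard against non-object JSON is an addition not present in the diff, and the sample line is made up.

```python
import json
from typing import Optional


def parse_xingfinger_line(line: str) -> Optional[dict]:
    """Parse one JSON line; return None on bad JSON or an empty URL."""
    try:
        item = json.loads(line)
    except json.JSONDecodeError:
        return None
    if not isinstance(item, dict):  # added guard: e.g. a bare JSON list
        return None
    url = item.get('url', '').strip()
    if not url:
        return None
    cms = item.get('cms', '')
    techs = [t.strip() for t in cms.split(',') if t.strip()] if cms else []
    return {
        'url': url,
        'techs': techs,
        'server': item.get('server', ''),
        'title': item.get('title', ''),
        'status_code': item.get('status_code'),
        'content_length': item.get('length'),
    }


# Made-up sample line in the documented output format
sample = ('{"url": "https://example.com", "cms": "WordPress, PHP,nginx", '
          '"server": "nginx", "status_code": 200, "length": 1024, "title": "Home"}')
record = parse_xingfinger_line(sample)
print(record)
```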
def bulk_merge_tech_field(
source: str,
url_techs_map: dict[str, list[str]],
def bulk_merge_website_fields(
records: list[dict],
target_id: int
) -> dict:
"""
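Bulk-merge the tech array field (PostgreSQL native SQL)
Bulk-merge and update WebSite fields (PostgreSQL native SQL)
Merge strategy:
- tech: merge arrays and deduplicate
- title, webserver, status_code, content_length: updated only when the existing value is empty/NULL
Uses native PostgreSQL SQL to merge and deduplicate arrays efficiently.
If no record exists for a URL, a new one is created automatically.
Args:
records: parsed records, each containing {url, techs, server, title, status_code, content_length}
target_id: target ID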
Returns:
dict: {'updated_count': int, 'created_count': int}
"""
Model = _get_model_class(source)
table_name = Model._meta.db_table
from apps.asset.models import WebSite
table_name = WebSite._meta.db_table
updated_count = 0
created_count = 0
with connection.cursor() as cursor:
for url, techs in url_techs_map.items():
if not techs:
continue
for record in records:
url = record['url']
techs = record.get('techs', [])
server = record.get('server', '') or ''
title = record.get('title', '') or ''
status_code = record.get('status_code')
content_length = record.get('content_length')
# Try an update first (PostgreSQL array merge with deduplication)
sql = f"""
# Try an update first (merge strategy)
update_sql = f"""
UPDATE {table_name}
SET tech = (
SELECT ARRAY(SELECT DISTINCT unnest(
SET
tech = (SELECT ARRAY(SELECT DISTINCT unnest(
COALESCE(tech, ARRAY[]::varchar[]) || %s::varchar[]
))
)
))),
title = CASE WHEN title = '' OR title IS NULL THEN %s ELSE title END,
webserver = CASE WHEN webserver = '' OR webserver IS NULL THEN %s ELSE webserver END,
status_code = CASE WHEN status_code IS NULL THEN %s ELSE status_code END,
content_length = CASE WHEN content_length IS NULL THEN %s ELSE content_length END
WHERE url = %s AND target_id = %s
"""
cursor.execute(sql, [techs, url, target_id])
cursor.execute(update_sql, [techs, title, server, status_code, content_length, url, target_id])
if cursor.rowcount > 0:
updated_count += cursor.rowcount
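The merge strategy encoded in the UPDATE statement can be restated in plain Python to make its semantics explicit: `tech` is a deduplicated union, text columns are filled only when empty or NULL, and numeric columns only when NULL. This is an illustrative model, not code from the repository; the function name `merge_website_fields` is hypothetical, and the result is sorted only to make the demo deterministic (SQL `DISTINCT` guarantees no order).

```python
def merge_website_fields(existing: dict, incoming: dict) -> dict:
    """Model the SQL merge: union tech, fill other columns only if unset."""
    merged = dict(existing)
    merged['tech'] = sorted(
        set(existing.get('tech') or []) | set(incoming.get('techs') or [])
    )
    # title / webserver: overwrite only when the stored value is '' or NULL
    for column, key in (('title', 'title'), ('webserver', 'server')):
        if not existing.get(column):
            merged[column] = incoming.get(key) or ''
    # status_code / content_length: overwrite only when the stored value is NULL
    for column in ('status_code', 'content_length'):
        if existing.get(column) is None:
            merged[column] = incoming.get(column)
    return merged


# Hypothetical stored row and freshly parsed xingfinger record
row = {'tech': ['nginx'], 'title': '', 'webserver': 'nginx',
       'status_code': None, 'content_length': 10}
parsed = {'techs': ['PHP', 'nginx'], 'title': 'Home', 'server': 'Apache',
          'status_code': 200, 'content_length': 999}
result = merge_website_fields(row, parsed)
print(result)
```

The practical effect is that fingerprint detection enriches rows without overwriting values an earlier stage such as the site scan already recorded.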
@@ -114,20 +117,26 @@ def bulk_merge_tech_field(
# Insert a new record (with conflict handling)
insert_sql = f"""
INSERT INTO {table_name} (target_id, url, host, tech, created_at)
VALUES (%s, %s, %s, %s::varchar[], NOW())
INSERT INTO {table_name} (
target_id, url, host, location, title, webserver,
response_body, content_type, tech, status_code, content_length,
response_headers, created_at
)
VALUES (%s, %s, %s, '', %s, %s, '', '', %s::varchar[], %s, %s, '', NOW())
ON CONFLICT (target_id, url) DO UPDATE SET
tech = (
SELECT ARRAY(SELECT DISTINCT unnest(
COALESCE({table_name}.tech, ARRAY[]::varchar[]) || EXCLUDED.tech
))
)
tech = (SELECT ARRAY(SELECT DISTINCT unnest(
COALESCE({table_name}.tech, ARRAY[]::varchar[]) || EXCLUDED.tech
))),
title = CASE WHEN {table_name}.title = '' OR {table_name}.title IS NULL THEN EXCLUDED.title ELSE {table_name}.title END,
webserver = CASE WHEN {table_name}.webserver = '' OR {table_name}.webserver IS NULL THEN EXCLUDED.webserver ELSE {table_name}.webserver END,
status_code = CASE WHEN {table_name}.status_code IS NULL THEN EXCLUDED.status_code ELSE {table_name}.status_code END,
content_length = CASE WHEN {table_name}.content_length IS NULL THEN EXCLUDED.content_length ELSE {table_name}.content_length END
"""
cursor.execute(insert_sql, [target_id, url, host, techs])
cursor.execute(insert_sql, [target_id, url, host, title, server, techs, status_code, content_length])
created_count += 1
except Exception as e:
logger.warning("创建 %s 记录失败 (url=%s): %s", source, url, e)
logger.warning("创建 WebSite 记录失败 (url=%s): %s", url, e)
return {
'updated_count': updated_count,
@@ -141,12 +150,12 @@ def _parse_xingfinger_stream_output(
cwd: Optional[str] = None,
timeout: Optional[int] = None,
log_file: Optional[str] = None
) -> Generator[tuple[str, list[str]], None, None]:
) -> Generator[dict, None, None]:
"""
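Stream-parse the output of the xingfinger command
Processes the xingfinger command's stdout in real time via execute_stream, converting each line of JSON output
into (url, tech_list) format
into a dictionary containing all fields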
"""
logger.info("开始流式解析 xingfinger 命令输出 - 命令: %s", cmd)
@@ -193,43 +202,46 @@ def run_xingfinger_and_stream_update_tech_task(
batch_size: int = 100
) -> dict:
"""
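Stream-execute the xingfinger command and update the tech field in real time
Update the tech field of the table selected by the source parameter:
- website → WebSite.tech
- endpoint → Endpoint.tech (future extension)
Stream-execute the xingfinger command, save snapshots, and merge updates into the asset table
Processing flow:
1. Stream-execute the xingfinger command
2. Parse the JSON output in real time
3. After accumulating batch_size records, update the database in bulk
4. Use native PostgreSQL SQL to merge and de-duplicate arrays
5. Create the record automatically if it does not exist
2. Parse the JSON output in real time (all fields)
3. After accumulating batch_size records, process them as a batch:
- Save snapshots (WebsiteSnapshot)
- Merge updates into the asset table (WebSite)
Merge strategy:
- tech: merge arrays and de-duplicate
- title, webserver, status_code, content_length: update only when the existing value is empty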
Returns:
dict: {
'processed_records': int,
'updated_count': int,
'created_count': int,
'snapshot_count': int,
'batch_count': int
}
"""
logger.info(
"开始执行 xingfinger 并更新 tech - target_id=%s, source=%s, timeout=%s",
target_id, source, timeout
"开始执行 xingfinger - scan_id=%s, target_id=%s, timeout=%s",
scan_id, target_id, timeout
)
data_generator = None
snapshot_repo = DjangoWebsiteSnapshotRepository()
try:
# Initialize counters
processed_records = 0
updated_count = 0
created_count = 0
snapshot_count = 0
batch_count = 0
# URL -> techs mapping for the current batch
url_techs_map = {}
# Records in the current batch
batch_records = []
# Stream processing
data_generator = _parse_xingfinger_stream_output(
@@ -240,47 +252,43 @@ def run_xingfinger_and_stream_update_tech_task(
log_file=log_file
)
for url, techs in data_generator:
for record in data_generator:
processed_records += 1
batch_records.append(record)
# Accumulate into url_techs_map
if url in url_techs_map:
# Merge repeated detection results for the same URL
url_techs_map[url].extend(techs)
else:
url_techs_map[url] = techs
# Batch size reached: run the bulk update
if len(url_techs_map) >= batch_size:
# Batch size reached: process the batch
if len(batch_records) >= batch_size:
batch_count += 1
result = bulk_merge_tech_field(source, url_techs_map, target_id)
updated_count += result['updated_count']
created_count += result.get('created_count', 0)
logger.debug(
"批次 %d 完成 - 更新: %d, 创建: %d",
batch_count, result['updated_count'], result.get('created_count', 0)
result = _process_batch(
batch_records, scan_id, target_id, batch_count, snapshot_repo
)
updated_count += result['updated_count']
created_count += result['created_count']
snapshot_count += result['snapshot_count']
# Reset the batch
url_techs_map = {}
batch_records = []
# Process the final batch
if url_techs_map:
if batch_records:
batch_count += 1
result = bulk_merge_tech_field(source, url_techs_map, target_id)
result = _process_batch(
batch_records, scan_id, target_id, batch_count, snapshot_repo
)
updated_count += result['updated_count']
created_count += result.get('created_count', 0)
created_count += result['created_count']
snapshot_count += result['snapshot_count']
logger.info(
"✓ xingfinger 执行完成 - 处理记录: %d, 更新: %d, 创建: %d, 批次: %d",
processed_records, updated_count, created_count, batch_count
"✓ xingfinger 执行完成 - 处理: %d, 更新: %d, 创建: %d, 快照: %d, 批次: %d",
processed_records, updated_count, created_count, snapshot_count, batch_count
)
return {
'processed_records': processed_records,
'updated_count': updated_count,
'created_count': created_count,
'snapshot_count': snapshot_count,
'batch_count': batch_count
}
@@ -298,3 +306,67 @@ def run_xingfinger_and_stream_update_tech_task(
data_generator.close()
except Exception as e:
logger.debug("关闭生成器时出错: %s", e)
def _process_batch(
records: list[dict],
scan_id: int,
target_id: int,
batch_num: int,
snapshot_repo: DjangoWebsiteSnapshotRepository
) -> dict:
"""
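Process one batch of data: save snapshots and merge updates into the asset table
Args:
records: list of parsed records
scan_id: scan task ID
target_id: target ID
batch_num: batch number
snapshot_repo: snapshot repository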
Returns:
dict: {'updated_count': int, 'created_count': int, 'snapshot_count': int}
"""
# 1. Build the list of snapshot DTOs
snapshot_dtos = []
for record in records:
# Extract the host from the URL
parsed = urlparse(record['url'])
host = parsed.hostname or ''
dto = WebsiteSnapshotDTO(
scan_id=scan_id,
target_id=target_id,
url=record['url'],
host=host,
title=record.get('title', '') or '',
status=record.get('status_code'),
content_length=record.get('content_length'),
web_server=record.get('server', '') or '',
tech=record.get('techs', []),
)
snapshot_dtos.append(dto)
# 2. Save snapshots
snapshot_count = 0
if snapshot_dtos:
try:
snapshot_repo.save_snapshots(snapshot_dtos)
snapshot_count = len(snapshot_dtos)
except Exception as e:
logger.warning("批次 %d 保存快照失败: %s", batch_num, e)
# 3. Merge updates into the asset table
merge_result = bulk_merge_website_fields(records, target_id)
logger.debug(
"批次 %d 完成 - 更新: %d, 创建: %d, 快照: %d",
batch_num, merge_result['updated_count'], merge_result['created_count'], snapshot_count
)
return {
'updated_count': merge_result['updated_count'],
'created_count': merge_result['created_count'],
'snapshot_count': snapshot_count
}
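The merge strategy applied by the UPDATE and ON CONFLICT clauses above can be mirrored in plain Python, which may help when reasoning about it or unit-testing around it. This is a minimal sketch and not the project's code: `merge_website_fields` and `_fill_if_empty` are hypothetical names, and the sketch keeps first-seen order for `tech`, whereas `SELECT DISTINCT unnest(...)` guarantees no particular order.

```python
from typing import Any, Optional


def _fill_if_empty(existing: Optional[Any], incoming: Optional[Any], blank_is_empty: bool) -> Any:
    """Keep the existing value unless it is NULL (or '' for text columns)."""
    if existing is None or (blank_is_empty and existing == ""):
        return incoming
    return existing


def merge_website_fields(existing: dict, incoming: dict) -> dict:
    """Hypothetical in-memory equivalent of the SQL merge strategy."""
    merged = dict(existing)
    # tech: concatenate both arrays and drop duplicates (first-seen order kept here).
    combined = (existing.get("tech") or []) + (incoming.get("tech") or [])
    merged["tech"] = list(dict.fromkeys(combined))
    # Text columns: '' and NULL both count as empty.
    for field in ("title", "webserver"):
        merged[field] = _fill_if_empty(existing.get(field), incoming.get(field), True)
    # Numeric columns: only NULL counts as empty, so an existing 0 is kept.
    for field in ("status_code", "content_length"):
        merged[field] = _fill_if_empty(existing.get(field), incoming.get(field), False)
    return merged
```

Note the asymmetry the SQL encodes: an empty string in `title` or `webserver` is overwritten, but a stored `content_length` of 0 is not.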


@@ -129,11 +129,12 @@ class HttpxRecord:
self.content_type = data.get('content_type', '')
self.location = data.get('location', '')
self.webserver = data.get('webserver', '')
self.body_preview = data.get('body_preview', '')
self.response_body = data.get('body', '') # full response body, read from the body field
self.tech = data.get('tech', [])
self.vhost = data.get('vhost')
self.failed = data.get('failed', False)
self.timestamp = data.get('timestamp')
self.response_headers = data.get('raw_header', '') # raw response header string, read from the raw_header field
# Extract the hostname from the URL
self.host = self._extract_hostname()
@@ -354,12 +355,13 @@ def _save_batch(
location=record.location, # the location field stores redirect information
title=record.title[:1000] if record.title else '',
web_server=record.webserver[:200] if record.webserver else '',
body_preview=record.body_preview[:1000] if record.body_preview else '',
response_body=record.response_body if record.response_body else '',
content_type=record.content_type[:200] if record.content_type else '',
tech=record.tech if isinstance(record.tech, list) else [],
status=record.status_code,
content_length=record.content_length,
vhost=record.vhost
vhost=record.vhost,
response_headers=record.response_headers if record.response_headers else '',
)
snapshot_items.append(snapshot_dto)


@@ -2,8 +2,8 @@
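Streaming URL validation task based on execute_stream
Main features:
1. Run the httpx command in real time to verify that URLs are alive
2. Stream the command output and parse the live URLs
1. Run the httpx command in real time to validate URLs
2. Stream the command output and parse URL information
3. Save to the database in bulk (Endpoint table)
4. Avoid loading all URLs into memory at once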
@@ -14,7 +14,7 @@
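- Use execute_stream to process output in real time
- Stream processing to avoid running out of memory
- Bulk operations to reduce database round trips
- Save live URLs (status 2xx/3xx)
- Save all valid URLs (including 4xx/5xx) for security analysis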
"""
import logging
@@ -73,7 +73,7 @@ def _parse_and_validate_line(line: str) -> Optional[dict]:
Returns:
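Optional[dict]: a valid httpx record, or None if validation fails
Only returns live URLs (2xx/3xx status codes)
Saves all valid URLs without filtering on status code (403/404/500 and similar responses are also worth analyzing in a security scan)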
"""
try:
# Strip NUL characters before parsing the JSON
@@ -99,24 +99,21 @@ def _parse_and_validate_line(line: str) -> Optional[dict]:
logger.info("URL 为空,跳过 - 数据: %s", str(line_data)[:200])
return None
# Save live URLs (2xx or 3xx)
if status_code and (200 <= status_code < 400):
return {
'url': _sanitize_string(url),
'host': _sanitize_string(line_data.get('host', '')),
'status_code': status_code,
'title': _sanitize_string(line_data.get('title', '')),
'content_length': line_data.get('content_length', 0),
'content_type': _sanitize_string(line_data.get('content_type', '')),
'webserver': _sanitize_string(line_data.get('webserver', '')),
'location': _sanitize_string(line_data.get('location', '')),
'tech': line_data.get('tech', []),
'body_preview': _sanitize_string(line_data.get('body_preview', '')),
'vhost': line_data.get('vhost', False),
}
else:
logger.debug("URL 不存活(状态码: %s),跳过: %s", status_code, url)
return None
# Save all valid URLs (no status-code filtering)
return {
'url': _sanitize_string(url),
'host': _sanitize_string(line_data.get('host', '')),
'status_code': status_code,
'title': _sanitize_string(line_data.get('title', '')),
'content_length': line_data.get('content_length', 0),
'content_type': _sanitize_string(line_data.get('content_type', '')),
'webserver': _sanitize_string(line_data.get('webserver', '')),
'location': _sanitize_string(line_data.get('location', '')),
'tech': line_data.get('tech', []),
'response_body': _sanitize_string(line_data.get('body', '')),
'vhost': line_data.get('vhost', False),
'response_headers': _sanitize_string(line_data.get('raw_header', '')),
}
except Exception:
logger.info("跳过无法解析的行: %s", line[:100] if line else 'empty')
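The behavioral change in this hunk is that a record is kept whenever it has a URL, regardless of its status code. A simplified sketch of that rule follows; `parse_httpx_line` is a hypothetical stand-in for `_parse_and_validate_line`, it omits `_sanitize_string`, and it returns only a few of the fields. The `body` and `raw_header` input keys are the ones read in the diff above.

```python
import json
from typing import Optional


def parse_httpx_line(line: str) -> Optional[dict]:
    """Hypothetical, simplified parser: keep every record that has a URL."""
    try:
        # Strip raw NUL characters before decoding, as the task does.
        data = json.loads(line.replace("\x00", ""))
    except (json.JSONDecodeError, AttributeError):
        return None
    if not isinstance(data, dict) or not data.get("url"):
        return None
    return {
        "url": data["url"],
        "status_code": data.get("status_code"),  # no 2xx/3xx filter any more
        "response_body": data.get("body", ""),
        "response_headers": data.get("raw_header", ""),
    }
```

With this rule a 404 or 500 response produces a record just like a 200 does; only lines that fail to parse or lack a URL are dropped.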
@@ -302,10 +299,11 @@ def _save_batch(
webserver=record.get('webserver', ''),
content_type=record.get('content_type', ''),
tech=record.get('tech', []),
body_preview=record.get('body_preview', ''),
response_body=record.get('response_body', ''),
vhost=record.get('vhost', False),
matched_gf_patterns=[],
target_id=target_id,
response_headers=record.get('response_headers', ''),
)
snapshots.append(dto)
except Exception as e:


@@ -0,0 +1,80 @@
"""
Configuration merging utility module
Provides conflict detection and merging for multi-engine YAML configurations.
"""
from typing import List, Tuple
import yaml
class ConfigConflictError(Exception):
"""Configuration conflict exception
Raised when two or more engines define the same top-level scan-type key.
"""
def __init__(self, conflicts: List[Tuple[str, str, str]]):
"""
Args:
conflicts: list of (key, engine 1 name, engine 2 name) tuples
"""
self.conflicts = conflicts
msg = "; ".join([f"{k} 同时存在于「{e1}」和「{e2}」" for k, e1, e2 in conflicts])
super().__init__(f"扫描类型冲突: {msg}")
def merge_engine_configs(engines: List[Tuple[str, str]]) -> str:
"""
Merge the YAML configurations of multiple engines.
Args:
engines: list of (engine name, config YAML) tuples
Returns:
The merged YAML string
Raises:
ConfigConflictError: when top-level keys conflict
"""
if not engines:
return ""
if len(engines) == 1:
return engines[0][1]
# Track which engine each top-level key belongs to
key_to_engine: dict[str, str] = {}
conflicts: List[Tuple[str, str, str]] = []
for engine_name, config_yaml in engines:
if not config_yaml or not config_yaml.strip():
continue
try:
parsed = yaml.safe_load(config_yaml)
except yaml.YAMLError:
# Skip invalid YAML
continue
if not isinstance(parsed, dict):
continue
# Check for top-level key conflicts
for key in parsed.keys():
if key in key_to_engine:
conflicts.append((key, key_to_engine[key], engine_name))
else:
key_to_engine[key] = engine_name
if conflicts:
raise ConfigConflictError(conflicts)
# No conflicts: join the configs with a blank line between them
configs = []
for _, config_yaml in engines:
if config_yaml and config_yaml.strip():
configs.append(config_yaml.strip())
return "\n\n".join(configs)
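The conflict check at the heart of `merge_engine_configs` can be isolated from YAML parsing. The sketch below works on already-parsed dicts so that it needs no third-party import; `find_top_level_conflicts` is a hypothetical name, and the engine names and keys used with it are made up for illustration.

```python
from typing import Dict, List, Tuple


def find_top_level_conflicts(
    engines: List[Tuple[str, dict]],
) -> List[Tuple[str, str, str]]:
    """Return (key, first_engine, later_engine) for every duplicated top-level key."""
    key_to_engine: Dict[str, str] = {}
    conflicts: List[Tuple[str, str, str]] = []
    for engine_name, parsed in engines:
        for key in parsed:
            if key in key_to_engine:
                # The key was already claimed by an earlier engine.
                conflicts.append((key, key_to_engine[key], engine_name))
            else:
                key_to_engine[key] = engine_name
    return conflicts
```

Because the merged result is a plain text concatenation, it appears to rely on the top-level keys being disjoint, which is what this check enforces. One thing worth verifying in the module itself: configs skipped during the check (invalid YAML or non-dict content) are still included in the joined output.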


@@ -36,7 +36,14 @@ def _normalize_config_keys(config: Dict[str, Any]) -> Dict[str, Any]:
Returns:
A new dict with converted keys
Raises:
ValueError: raised when the config is None or not a dict
"""
if config is None:
raise ValueError("配置不能为空None请检查 YAML 格式,确保冒号后有配置内容或使用 {} 表示空配置")
if not isinstance(config, dict):
raise ValueError(f"配置格式错误:期望 dict实际 {type(config).__name__}")
return {
k.replace('-', '_') if isinstance(k, str) else k: v
for k, v in config.items()


@@ -18,6 +18,9 @@ FINGERPRINT_LIB_MAP = {
'ehole': 'ensure_ehole_fingerprint_local',
'goby': 'ensure_goby_fingerprint_local',
'wappalyzer': 'ensure_wappalyzer_fingerprint_local',
'fingers': 'ensure_fingers_fingerprint_local',
'fingerprinthub': 'ensure_fingerprinthub_fingerprint_local',
'arl': 'ensure_arl_fingerprint_local',
}
@@ -221,10 +224,170 @@ def get_fingerprint_paths(lib_names: list) -> dict:
return paths
def ensure_fingers_fingerprint_local() -> str:
"""
Ensure the latest Fingers fingerprint file exists locally (with caching)
Returns:
str: path to the local fingerprint file
"""
from apps.engine.services.fingerprints import FingersFingerprintService
service = FingersFingerprintService()
current_version = service.get_fingerprint_version()
# Cache directory and files
base_dir = getattr(settings, 'FINGERPRINTS_BASE_PATH', '/opt/xingrin/fingerprints')
os.makedirs(base_dir, exist_ok=True)
cache_file = os.path.join(base_dir, 'fingers.json')
version_file = os.path.join(base_dir, 'fingers.version')
# Check the cached version
cached_version = None
if os.path.exists(version_file):
try:
with open(version_file, 'r') as f:
cached_version = f.read().strip()
except OSError as e:
logger.warning("读取 Fingers 版本文件失败: %s", e)
# Versions match: return the cached file directly
if cached_version == current_version and os.path.exists(cache_file):
logger.info("Fingers 指纹文件缓存有效(版本匹配): %s", cache_file)
return cache_file
# Version mismatch: export again
logger.info(
"Fingers 指纹文件需要更新: cached=%s, current=%s",
cached_version, current_version
)
data = service.get_export_data()
with open(cache_file, 'w', encoding='utf-8') as f:
json.dump(data, f, ensure_ascii=False)
# Write the version file
try:
with open(version_file, 'w') as f:
f.write(current_version)
except OSError as e:
logger.warning("写入 Fingers 版本文件失败: %s", e)
logger.info("Fingers 指纹文件已更新: %s", cache_file)
return cache_file
def ensure_fingerprinthub_fingerprint_local() -> str:
"""
Ensure the latest FingerPrintHub fingerprint file exists locally (with caching)
Returns:
str: path to the local fingerprint file
"""
from apps.engine.services.fingerprints import FingerPrintHubFingerprintService
service = FingerPrintHubFingerprintService()
current_version = service.get_fingerprint_version()
# Cache directory and files
base_dir = getattr(settings, 'FINGERPRINTS_BASE_PATH', '/opt/xingrin/fingerprints')
os.makedirs(base_dir, exist_ok=True)
cache_file = os.path.join(base_dir, 'fingerprinthub.json')
version_file = os.path.join(base_dir, 'fingerprinthub.version')
# Check the cached version
cached_version = None
if os.path.exists(version_file):
try:
with open(version_file, 'r') as f:
cached_version = f.read().strip()
except OSError as e:
logger.warning("读取 FingerPrintHub 版本文件失败: %s", e)
# Versions match: return the cached file directly
if cached_version == current_version and os.path.exists(cache_file):
logger.info("FingerPrintHub 指纹文件缓存有效(版本匹配): %s", cache_file)
return cache_file
# Version mismatch: export again
logger.info(
"FingerPrintHub 指纹文件需要更新: cached=%s, current=%s",
cached_version, current_version
)
data = service.get_export_data()
with open(cache_file, 'w', encoding='utf-8') as f:
json.dump(data, f, ensure_ascii=False)
# Write the version file
try:
with open(version_file, 'w') as f:
f.write(current_version)
except OSError as e:
logger.warning("写入 FingerPrintHub 版本文件失败: %s", e)
logger.info("FingerPrintHub 指纹文件已更新: %s", cache_file)
return cache_file
def ensure_arl_fingerprint_local() -> str:
"""
Ensure the latest ARL fingerprint file exists locally (with caching)
Returns:
str: path to the local fingerprint file (YAML format)
"""
import yaml
from apps.engine.services.fingerprints import ARLFingerprintService
service = ARLFingerprintService()
current_version = service.get_fingerprint_version()
# Cache directory and files
base_dir = getattr(settings, 'FINGERPRINTS_BASE_PATH', '/opt/xingrin/fingerprints')
os.makedirs(base_dir, exist_ok=True)
cache_file = os.path.join(base_dir, 'arl.yaml')
version_file = os.path.join(base_dir, 'arl.version')
# Check the cached version
cached_version = None
if os.path.exists(version_file):
try:
with open(version_file, 'r') as f:
cached_version = f.read().strip()
except OSError as e:
logger.warning("读取 ARL 版本文件失败: %s", e)
# Versions match: return the cached file directly
if cached_version == current_version and os.path.exists(cache_file):
logger.info("ARL 指纹文件缓存有效(版本匹配): %s", cache_file)
return cache_file
# Version mismatch: export again
logger.info(
"ARL 指纹文件需要更新: cached=%s, current=%s",
cached_version, current_version
)
data = service.get_export_data()
with open(cache_file, 'w', encoding='utf-8') as f:
yaml.dump(data, f, allow_unicode=True, default_flow_style=False)
# Write the version file
try:
with open(version_file, 'w') as f:
f.write(current_version)
except OSError as e:
logger.warning("写入 ARL 版本文件失败: %s", e)
logger.info("ARL 指纹文件已更新: %s", cache_file)
return cache_file
__all__ = [
"ensure_ehole_fingerprint_local",
"ensure_goby_fingerprint_local",
"ensure_wappalyzer_fingerprint_local",
"ensure_fingers_fingerprint_local",
"ensure_fingerprinthub_fingerprint_local",
"ensure_arl_fingerprint_local",
"get_fingerprint_paths",
"FINGERPRINT_LIB_MAP",
]
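The three new `ensure_*_fingerprint_local` functions repeat the same version-checked cache logic and differ only in service, file name, and serializer. One possible way to express the shared pattern is sketched below. `ensure_cached_export` is a hypothetical helper that is not part of this change, and it covers only the JSON case (the ARL variant writes YAML instead).

```python
import json
import os
from typing import Any, Callable


def ensure_cached_export(
    base_dir: str,
    name: str,
    current_version: str,
    export: Callable[[], Any],
) -> str:
    """Rewrite <name>.json only when the stored version differs from current_version."""
    os.makedirs(base_dir, exist_ok=True)
    cache_file = os.path.join(base_dir, f"{name}.json")
    version_file = os.path.join(base_dir, f"{name}.version")

    cached_version = None
    if os.path.exists(version_file):
        try:
            with open(version_file, "r", encoding="utf-8") as f:
                cached_version = f.read().strip()
        except OSError:
            cached_version = None  # unreadable version file: treat as a cache miss

    if cached_version == current_version and os.path.exists(cache_file):
        return cache_file  # cache hit: export() is never called

    # Cache miss: write the data first, then record the version it corresponds to.
    with open(cache_file, "w", encoding="utf-8") as f:
        json.dump(export(), f, ensure_ascii=False)
    with open(version_file, "w", encoding="utf-8") as f:
        f.write(current_version)
    return cache_file
```

Passing the export step as a callable keeps the potentially expensive database read off the cache-hit path, matching the order of operations in the functions above.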


@@ -19,7 +19,6 @@ from typing import Optional
from django.conf import settings
from apps.common.utils.git_proxy import get_git_proxy_url
from apps.engine.models import NucleiTemplateRepo
logger = logging.getLogger(__name__)
@@ -50,7 +49,7 @@ def get_local_commit_hash(local_path: Path) -> Optional[str]:
def git_clone(repo_url: str, local_path: Path) -> bool:
"""Git clone a repository (with Git acceleration support)
"""Git clone a repository
Args:
repo_url: repository URL
@@ -59,15 +58,9 @@ def git_clone(repo_url: str, local_path: Path) -> bool:
Returns:
Whether the clone succeeded
"""
# Transform URL for Git acceleration if enabled
proxied_url = get_git_proxy_url(repo_url)
if proxied_url != repo_url:
logger.info("Using Git acceleration: %s -> %s", repo_url, proxied_url)
logger.info("正在 clone 模板仓库: %s -> %s", repo_url, local_path)
result = subprocess.run(
["git", "clone", "--depth", "1", proxied_url, str(local_path)],
["git", "clone", "--depth", "1", repo_url, str(local_path)],
check=False,
stdout=subprocess.PIPE,
stderr=subprocess.PIPE,


@@ -96,7 +96,13 @@ def ensure_wordlist_local(wordlist_name: str) -> str:
ssl_context.check_hostname = False
ssl_context.verify_mode = ssl.CERT_NONE
with urllib_request.urlopen(download_url, context=ssl_context) as resp:
# Build a request that carries the API key
req = urllib_request.Request(download_url)
worker_api_key = os.getenv('WORKER_API_KEY', '')
if worker_api_key:
req.add_header('X-Worker-API-Key', worker_api_key)
with urllib_request.urlopen(req, context=ssl_context) as resp:
if resp.status != 200:
raise RuntimeError(f"下载字典失败HTTP {resp.status}")
data = resp.read()
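The request construction added in this hunk can be exercised without any network access. In the sketch below, `build_download_request` is a hypothetical wrapper around the same three steps; the header name and the `WORKER_API_KEY` variable are taken from the diff.

```python
import os
from urllib import request as urllib_request


def build_download_request(download_url: str) -> urllib_request.Request:
    """Attach X-Worker-API-Key only when WORKER_API_KEY is set in the environment."""
    req = urllib_request.Request(download_url)
    worker_api_key = os.getenv("WORKER_API_KEY", "")
    if worker_api_key:
        req.add_header("X-Worker-API-Key", worker_api_key)
    return req
```

`Request.add_header` stores the name in capitalized form (`X-worker-api-key`), so lookups through `get_header` or `has_header` need that spelling. HTTP header names are case-insensitive, so the receiving server is unaffected.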


@@ -7,6 +7,10 @@ from django.core.exceptions import ObjectDoesNotExist, ValidationError
from django.db.utils import DatabaseError, IntegrityError, OperationalError
import logging
from apps.common.response_helpers import success_response, error_response
from apps.common.error_codes import ErrorCodes
from apps.scan.utils.config_merger import ConfigConflictError
logger = logging.getLogger(__name__)
from ..models import Scan, ScheduledScan
@@ -75,20 +79,31 @@ class ScanViewSet(viewsets.ModelViewSet):
scan_service = ScanService()
result = scan_service.delete_scans_two_phase([scan.id])
return Response({
'message': f'已删除扫描任务: Scan #{scan.id}',
'scanId': scan.id,
'deletedCount': result['soft_deleted_count'],
'deletedScans': result['scan_names']
}, status=status.HTTP_200_OK)
return success_response(
data={
'scanId': scan.id,
'deletedCount': result['soft_deleted_count'],
'deletedScans': result['scan_names']
}
)
except Scan.DoesNotExist:
raise NotFound('扫描任务不存在')
return error_response(
code=ErrorCodes.NOT_FOUND,
status_code=status.HTTP_404_NOT_FOUND
)
except ValueError as e:
raise NotFound(str(e))
return error_response(
code=ErrorCodes.NOT_FOUND,
message=str(e),
status_code=status.HTTP_404_NOT_FOUND
)
except Exception as e:
logger.exception("删除扫描任务时发生错误")
raise APIException('服务器错误,请稍后重试')
return error_response(
code=ErrorCodes.SERVER_ERROR,
status_code=status.HTTP_500_INTERNAL_SERVER_ERROR
)
@action(detail=False, methods=['post'])
def quick(self, request):
@@ -104,7 +119,7 @@ class ScanViewSet(viewsets.ModelViewSet):
Request parameters:
{
"targets": [{"name": "example.com"}, {"name": "https://example.com/api"}],
"engine_id": 1
"engine_ids": [1, 2]
}
Supported input formats:
@@ -119,7 +134,7 @@ class ScanViewSet(viewsets.ModelViewSet):
serializer.is_valid(raise_exception=True)
targets_data = serializer.validated_data['targets']
engine_id = serializer.validated_data.get('engine_id')
engine_ids = serializer.validated_data.get('engine_ids')
try:
# Extract the list of input strings
@@ -127,47 +142,82 @@ class ScanViewSet(viewsets.ModelViewSet):
# 1. Use QuickScanService to parse the input and create assets
quick_scan_service = QuickScanService()
result = quick_scan_service.process_quick_scan(inputs, engine_id)
result = quick_scan_service.process_quick_scan(inputs, engine_ids[0] if engine_ids else None)
targets = result['targets']
if not targets:
return Response({
'error': '没有有效的目标可供扫描',
'errors': result.get('errors', [])
}, status=status.HTTP_400_BAD_REQUEST)
return error_response(
code=ErrorCodes.VALIDATION_ERROR,
message='No valid targets for scanning',
details=result.get('errors', []),
status_code=status.HTTP_400_BAD_REQUEST
)
# 2. Fetch the scan engine
engine_service = EngineService()
engine = engine_service.get_engine(engine_id)
if not engine:
raise ValidationError(f'扫描引擎 ID {engine_id} 不存在')
# 2. Prepare the multi-engine scan
scan_service = ScanService()
_, merged_configuration, engine_names, engine_ids = scan_service.prepare_initiate_scan_multi_engine(
target_id=targets[0].id, # use the first target to validate the engines
engine_ids=engine_ids
)
# 3. Launch scans in bulk
scan_service = ScanService()
created_scans = scan_service.create_scans(
targets=targets,
engine=engine
engine_ids=engine_ids,
engine_names=engine_names,
merged_configuration=merged_configuration
)
# Check whether any scan tasks were created
if not created_scans:
return error_response(
code=ErrorCodes.VALIDATION_ERROR,
message='No scan tasks were created. All targets may already have active scans.',
details={
'targetStats': result['target_stats'],
'assetStats': result['asset_stats'],
'errors': result.get('errors', [])
},
status_code=status.HTTP_422_UNPROCESSABLE_ENTITY
)
# Serialize the response
scan_serializer = ScanSerializer(created_scans, many=True)
return Response({
'message': f'快速扫描已启动:{len(created_scans)} 个任务',
'target_stats': result['target_stats'],
'asset_stats': result['asset_stats'],
'errors': result.get('errors', []),
'scans': scan_serializer.data
}, status=status.HTTP_201_CREATED)
return success_response(
data={
'count': len(created_scans),
'targetStats': result['target_stats'],
'assetStats': result['asset_stats'],
'errors': result.get('errors', []),
'scans': scan_serializer.data
},
status_code=status.HTTP_201_CREATED
)
except ConfigConflictError as e:
return error_response(
code='CONFIG_CONFLICT',
message=str(e),
details=[
{'key': k, 'engines': [e1, e2]}
for k, e1, e2 in e.conflicts
],
status_code=status.HTTP_400_BAD_REQUEST
)
except ValidationError as e:
return Response({'error': str(e)}, status=status.HTTP_400_BAD_REQUEST)
return error_response(
code=ErrorCodes.VALIDATION_ERROR,
message=str(e),
status_code=status.HTTP_400_BAD_REQUEST
)
except Exception as e:
logger.exception("快速扫描启动失败")
return Response(
{'error': '服务器内部错误,请稍后重试'},
status=status.HTTP_500_INTERNAL_SERVER_ERROR
return error_response(
code=ErrorCodes.SERVER_ERROR,
status_code=status.HTTP_500_INTERNAL_SERVER_ERROR
)
@action(detail=False, methods=['post'])
@@ -178,7 +228,7 @@ class ScanViewSet(viewsets.ModelViewSet):
Request parameters:
- organization_id: organization ID (int, optional)
- target_id: target ID (int, optional)
- engine_id: scan engine ID (int, required)
- engine_ids: list of scan engine IDs (list[int], required)
Note: provide either organization_id or target_id
@@ -188,55 +238,92 @@ class ScanViewSet(viewsets.ModelViewSet):
# Read the request data
organization_id = request.data.get('organization_id')
target_id = request.data.get('target_id')
engine_id = request.data.get('engine_id')
engine_ids = request.data.get('engine_ids')
# Validate engine_ids
if not engine_ids:
return error_response(
code=ErrorCodes.VALIDATION_ERROR,
message='缺少必填参数: engine_ids',
status_code=status.HTTP_400_BAD_REQUEST
)
if not isinstance(engine_ids, list):
return error_response(
code=ErrorCodes.VALIDATION_ERROR,
message='engine_ids 必须是数组',
status_code=status.HTTP_400_BAD_REQUEST
)
try:
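# Step 1: prepare the data needed for the scan (validate parameters, look up resources, return the target list and engine)
# Step 1: prepare the data needed for the multi-engine scan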
scan_service = ScanService()
targets, engine = scan_service.prepare_initiate_scan(
targets, merged_configuration, engine_names, engine_ids = scan_service.prepare_initiate_scan_multi_engine(
organization_id=organization_id,
target_id=target_id,
engine_id=engine_id
engine_ids=engine_ids
)
# Step 2: create scan records in bulk and dispatch the scan tasks
created_scans = scan_service.create_scans(
targets=targets,
engine=engine
engine_ids=engine_ids,
engine_names=engine_names,
merged_configuration=merged_configuration
)
# Check whether any scan tasks were created
if not created_scans:
return error_response(
code=ErrorCodes.VALIDATION_ERROR,
message='No scan tasks were created. All targets may already have active scans.',
status_code=status.HTTP_422_UNPROCESSABLE_ENTITY
)
# Serialize the response
scan_serializer = ScanSerializer(created_scans, many=True)
return Response(
{
'message': f'已成功发起 {len(created_scans)} 个扫描任务',
return success_response(
data={
'count': len(created_scans),
'scans': scan_serializer.data
},
status=status.HTTP_201_CREATED
status_code=status.HTTP_201_CREATED
)
except ConfigConflictError as e:
return error_response(
code='CONFIG_CONFLICT',
message=str(e),
details=[
{'key': k, 'engines': [e1, e2]}
for k, e1, e2 in e.conflicts
],
status_code=status.HTTP_400_BAD_REQUEST
)
except ObjectDoesNotExist as e:
# Resource-not-found error (raised by the service layer)
error_msg = str(e)
return Response(
{'error': error_msg},
status=status.HTTP_404_NOT_FOUND
return error_response(
code=ErrorCodes.NOT_FOUND,
message=str(e),
status_code=status.HTTP_404_NOT_FOUND
)
except ValidationError as e:
# Parameter validation error (raised by the service layer)
return Response(
{'error': str(e)},
status=status.HTTP_400_BAD_REQUEST
return error_response(
code=ErrorCodes.VALIDATION_ERROR,
message=str(e),
status_code=status.HTTP_400_BAD_REQUEST
)
except (DatabaseError, IntegrityError, OperationalError):
# Database error
return Response(
{'error': '数据库错误,请稍后重试'},
status=status.HTTP_503_SERVICE_UNAVAILABLE
return error_response(
code=ErrorCodes.SERVER_ERROR,
message='Database error',
status_code=status.HTTP_503_SERVICE_UNAVAILABLE
)
# All snapshot-related actions and exports have moved to the snapshot ViewSet in asset/views.py
@@ -278,21 +365,24 @@ class ScanViewSet(viewsets.ModelViewSet):
# Parameter validation
if not ids:
return Response(
{'error': '缺少必填参数: ids'},
status=status.HTTP_400_BAD_REQUEST
return error_response(
code=ErrorCodes.VALIDATION_ERROR,
message='Missing required parameter: ids',
status_code=status.HTTP_400_BAD_REQUEST
)
if not isinstance(ids, list):
return Response(
{'error': 'ids 必须是数组'},
status=status.HTTP_400_BAD_REQUEST
return error_response(
code=ErrorCodes.VALIDATION_ERROR,
message='ids must be an array',
status_code=status.HTTP_400_BAD_REQUEST
)
if not all(isinstance(i, int) for i in ids):
return Response(
{'error': 'ids 数组中的所有元素必须是整数'},
status=status.HTTP_400_BAD_REQUEST
return error_response(
code=ErrorCodes.VALIDATION_ERROR,
message='All elements in ids array must be integers',
status_code=status.HTTP_400_BAD_REQUEST
)
try:
@@ -300,19 +390,27 @@ class ScanViewSet(viewsets.ModelViewSet):
scan_service = ScanService()
result = scan_service.delete_scans_two_phase(ids)
return Response({
'message': f"已删除 {result['soft_deleted_count']} 个扫描任务",
'deletedCount': result['soft_deleted_count'],
'deletedScans': result['scan_names']
}, status=status.HTTP_200_OK)
return success_response(
data={
'deletedCount': result['soft_deleted_count'],
'deletedScans': result['scan_names']
}
)
except ValueError as e:
# No records found
raise NotFound(str(e))
return error_response(
code=ErrorCodes.NOT_FOUND,
message=str(e),
status_code=status.HTTP_404_NOT_FOUND
)
except Exception as e:
logger.exception("批量删除扫描任务时发生错误")
raise APIException('服务器错误,请稍后重试')
return error_response(
code=ErrorCodes.SERVER_ERROR,
status_code=status.HTTP_500_INTERNAL_SERVER_ERROR
)
@action(detail=False, methods=['get'])
def statistics(self, request):
@@ -337,22 +435,25 @@ class ScanViewSet(viewsets.ModelViewSet):
scan_service = ScanService()
stats = scan_service.get_statistics()
return Response({
'total': stats['total'],
'running': stats['running'],
'completed': stats['completed'],
'failed': stats['failed'],
'totalVulns': stats['total_vulns'],
'totalSubdomains': stats['total_subdomains'],
'totalEndpoints': stats['total_endpoints'],
'totalWebsites': stats['total_websites'],
'totalAssets': stats['total_assets'],
})
return success_response(
data={
'total': stats['total'],
'running': stats['running'],
'completed': stats['completed'],
'failed': stats['failed'],
'totalVulns': stats['total_vulns'],
'totalSubdomains': stats['total_subdomains'],
'totalEndpoints': stats['total_endpoints'],
'totalWebsites': stats['total_websites'],
'totalAssets': stats['total_assets'],
}
)
except (DatabaseError, OperationalError):
return Response(
{'error': '数据库错误,请稍后重试'},
status=status.HTTP_503_SERVICE_UNAVAILABLE
return error_response(
code=ErrorCodes.SERVER_ERROR,
message='Database error',
status_code=status.HTTP_503_SERVICE_UNAVAILABLE
)
@action(detail=True, methods=['post'])
@@ -383,35 +484,31 @@ class ScanViewSet(viewsets.ModelViewSet):
# Check whether the failure is caused by a status that does not allow stopping
scan = scan_service.get_scan(scan_id=pk, prefetch_relations=False)
if scan and scan.status not in [ScanStatus.RUNNING, ScanStatus.INITIATED]:
return Response(
{
'error': f'无法停止扫描:当前状态为 {ScanStatus(scan.status).label}',
'detail': '只能停止运行中或初始化状态的扫描'
},
status=status.HTTP_400_BAD_REQUEST
return error_response(
code=ErrorCodes.BAD_REQUEST,
message=f'Cannot stop scan: current status is {ScanStatus(scan.status).label}',
status_code=status.HTTP_400_BAD_REQUEST
)
# Other failure reasons
return Response(
{'error': '停止扫描失败'},
status=status.HTTP_500_INTERNAL_SERVER_ERROR
return error_response(
code=ErrorCodes.SERVER_ERROR,
status_code=status.HTTP_500_INTERNAL_SERVER_ERROR
)
return Response(
{
'message': f'扫描已停止,已撤销 {revoked_count} 个任务',
'revokedTaskCount': revoked_count
},
status=status.HTTP_200_OK
return success_response(
data={'revokedTaskCount': revoked_count}
)
except ObjectDoesNotExist:
return Response(
{'error': f'扫描 ID {pk} 不存在'},
status=status.HTTP_404_NOT_FOUND
return error_response(
code=ErrorCodes.NOT_FOUND,
message=f'Scan ID {pk} not found',
status_code=status.HTTP_404_NOT_FOUND
)
except (DatabaseError, IntegrityError, OperationalError):
return Response(
{'error': '数据库错误,请稍后重试'},
status=status.HTTP_503_SERVICE_UNAVAILABLE
return error_response(
code=ErrorCodes.SERVER_ERROR,
message='Database error',
status_code=status.HTTP_503_SERVICE_UNAVAILABLE
)
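The views above validate `engine_ids` (present, is a list) and `ids` (present, is a list, all integers) with separate inline checks. A small shared validator could cover both; the following is only a sketch under that assumption. `validate_id_list` is a hypothetical name, the messages reuse the English wording from the `ids` checks, and, unlike the `isinstance(i, int)` check in the diff, it also rejects booleans.

```python
from typing import Any, Optional


def validate_id_list(value: Any, name: str) -> Optional[str]:
    """Return an error message for an invalid ID list, or None when it is valid."""
    if not value:
        return f"Missing required parameter: {name}"
    if not isinstance(value, list):
        return f"{name} must be an array"
    # bool is a subclass of int in Python, so exclude it explicitly.
    if not all(isinstance(i, int) and not isinstance(i, bool) for i in value):
        return f"All elements in {name} array must be integers"
    return None
```

A view could pass a non-None result straight into `error_response(code=ErrorCodes.VALIDATION_ERROR, message=..., status_code=status.HTTP_400_BAD_REQUEST)`, which is the call shape used throughout this diff.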


@@ -17,7 +17,10 @@ from ..serializers import (
)
from ..services.scheduled_scan_service import ScheduledScanService
from ..repositories import ScheduledScanDTO
from ..utils.config_merger import ConfigConflictError
from apps.common.pagination import BasePagination
from apps.common.response_helpers import success_response, error_response
from apps.common.error_codes import ErrorCodes
logger = logging.getLogger(__name__)
@@ -65,7 +68,7 @@ class ScheduledScanViewSet(viewsets.ModelViewSet):
data = serializer.validated_data
dto = ScheduledScanDTO(
name=data['name'],
engine_id=data['engine_id'],
engine_ids=data['engine_ids'],
organization_id=data.get('organization_id'),
target_id=data.get('target_id'),
cron_expression=data.get('cron_expression', '0 2 * * *'),
@@ -75,15 +78,26 @@ class ScheduledScanViewSet(viewsets.ModelViewSet):
scheduled_scan = self.service.create(dto)
response_serializer = ScheduledScanSerializer(scheduled_scan)
return Response(
{
'message': f'创建定时扫描任务成功: {scheduled_scan.name}',
'scheduled_scan': response_serializer.data
},
status=status.HTTP_201_CREATED
return success_response(
data=response_serializer.data,
status_code=status.HTTP_201_CREATED
)
except ConfigConflictError as e:
return error_response(
code='CONFIG_CONFLICT',
message=str(e),
details=[
{'key': k, 'engines': [e1, e2]}
for k, e1, e2 in e.conflicts
],
status_code=status.HTTP_400_BAD_REQUEST
)
except ValidationError as e:
return Response({'error': str(e)}, status=status.HTTP_400_BAD_REQUEST)
return error_response(
code=ErrorCodes.VALIDATION_ERROR,
message=str(e),
status_code=status.HTTP_400_BAD_REQUEST
)
def update(self, request, *args, **kwargs):
"""Update a scheduled scan task"""
@@ -95,7 +109,7 @@ class ScheduledScanViewSet(viewsets.ModelViewSet):
data = serializer.validated_data
dto = ScheduledScanDTO(
name=data.get('name'),
engine_id=data.get('engine_id'),
engine_ids=data.get('engine_ids'),
organization_id=data.get('organization_id'),
target_id=data.get('target_id'),
cron_expression=data.get('cron_expression'),
@@ -105,24 +119,37 @@ class ScheduledScanViewSet(viewsets.ModelViewSet):
scheduled_scan = self.service.update(instance.id, dto)
response_serializer = ScheduledScanSerializer(scheduled_scan)
return Response({
'message': f'更新定时扫描任务成功: {scheduled_scan.name}',
'scheduled_scan': response_serializer.data
})
return success_response(data=response_serializer.data)
except ConfigConflictError as e:
return error_response(
code='CONFIG_CONFLICT',
message=str(e),
details=[
{'key': k, 'engines': [e1, e2]}
for k, e1, e2 in e.conflicts
],
status_code=status.HTTP_400_BAD_REQUEST
)
except ValidationError as e:
return Response({'error': str(e)}, status=status.HTTP_400_BAD_REQUEST)
return error_response(
code=ErrorCodes.VALIDATION_ERROR,
message=str(e),
status_code=status.HTTP_400_BAD_REQUEST
)
def destroy(self, request, *args, **kwargs):
"""Delete a scheduled scan task."""
instance = self.get_object()
scan_id = instance.id
name = instance.name
if self.service.delete(instance.id):
return Response({
'message': f'Scheduled scan deleted successfully: {name}',
'id': instance.id
})
return Response({'error': 'Deletion failed'}, status=status.HTTP_500_INTERNAL_SERVER_ERROR)
if self.service.delete(scan_id):
return success_response(data={'id': scan_id, 'name': name})
return error_response(
code=ErrorCodes.SERVER_ERROR,
message='Failed to delete scheduled scan',
status_code=status.HTTP_500_INTERNAL_SERVER_ERROR
)
@action(detail=True, methods=['post'])
def toggle(self, request, pk=None):
@@ -136,14 +163,11 @@ class ScheduledScanViewSet(viewsets.ModelViewSet):
scheduled_scan = self.get_object()
response_serializer = ScheduledScanSerializer(scheduled_scan)
status_text = 'Enabled' if is_enabled else 'Disabled'
return Response({
'message': f'{status_text} scheduled scan',
'scheduled_scan': response_serializer.data
})
return success_response(data=response_serializer.data)
return Response(
{'error': f'Scheduled scan with ID {pk} does not exist or the operation failed'},
status=status.HTTP_404_NOT_FOUND
return error_response(
code=ErrorCodes.NOT_FOUND,
message=f'Scheduled scan with ID {pk} not found or operation failed',
status_code=status.HTTP_404_NOT_FOUND
)
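
The handlers above now route every response through `success_response` and `error_response`, whose implementation is not part of this diff. A minimal sketch of what such helpers might look like follows. The envelope shape (`data`, `error`, `code`, `message`, `details`), the plain-dict return value, and the `conflict_details` helper name are all assumptions made for illustration, not the project's actual code.

```python
from typing import Any, Optional


def success_response(data: Any = None, status_code: int = 200) -> dict:
    """Hypothetical helper: wrap a payload in a uniform success envelope."""
    return {"status_code": status_code, "body": {"data": data}}


def error_response(code: str, message: str, details: Optional[list] = None,
                   status_code: int = 400) -> dict:
    """Hypothetical helper: wrap an error in a uniform error envelope."""
    error = {"code": code, "message": message}
    if details is not None:
        # Only include details when the caller supplies them.
        error["details"] = details
    return {"status_code": status_code, "body": {"error": error}}


def conflict_details(conflicts: list) -> list:
    """Mirror the comprehension used in the ConfigConflictError handlers.

    Each conflict is assumed to be a (key, engine_a, engine_b) tuple,
    as implied by `for k, e1, e2 in e.conflicts` in the diff.
    """
    return [{"key": k, "engines": [e1, e2]} for k, e1, e2 in conflicts]
```

With helpers of this shape, a client can branch on the presence of `error` rather than parsing free-form message strings, which appears to be the motivation for dropping the `message` fields in this change.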

View File

@@ -10,6 +10,7 @@ from .serializers import OrganizationSerializer, TargetSerializer, TargetDetailS
from .services.target_service import TargetService
from .services.organization_service import OrganizationService
from apps.common.pagination import BasePagination
from apps.common.response_helpers import success_response
logger = logging.getLogger(__name__)
@@ -94,9 +95,8 @@ class OrganizationViewSet(viewsets.ModelViewSet):
# Bulk unlink (use the IDs directly to avoid querying the objects)
organization.targets.remove(*existing_target_ids)
return Response({
'unlinked_count': existing_count,
'message': f'Successfully unlinked {existing_count} targets'
return success_response(data={
'unlinkedCount': existing_count
})
def destroy(self, request, *args, **kwargs):
@@ -124,13 +124,12 @@ class OrganizationViewSet(viewsets.ModelViewSet):
# Call the service-layer business method directly (soft delete + dispatch hard-delete task)
result = self.org_service.delete_organizations_two_phase([organization.id])
return Response({
'message': f'Deleted organization: {organization.name}',
return success_response(data={
'organizationId': organization.id,
'organizationName': organization.name,
'deletedCount': result['soft_deleted_count'],
'deletedOrganizations': result['organization_names']
}, status=200)
})
except Organization.DoesNotExist:
raise NotFound('Organization not found')
@@ -181,11 +180,10 @@ class OrganizationViewSet(viewsets.ModelViewSet):
# Call the service-layer business method (soft delete + dispatch hard-delete task)
result = self.org_service.delete_organizations_two_phase(ids)
return Response({
'message': f"Deleted {result['soft_deleted_count']} organizations",
return success_response(data={
'deletedCount': result['soft_deleted_count'],
'deletedOrganizations': result['organization_names']
}, status=200)
})
except ValueError as e:
raise NotFound(str(e))
@@ -271,12 +269,11 @@ class TargetViewSet(viewsets.ModelViewSet):
# Call the service-layer business method directly (soft delete + dispatch hard-delete task)
result = self.target_service.delete_targets_two_phase([target.id])
return Response({
'message': f'Deleted target: {target.name}',
return success_response(data={
'targetId': target.id,
'targetName': target.name,
'deletedCount': result['soft_deleted_count']
}, status=200)
})
except Target.DoesNotExist:
raise NotFound('Target not found')
@@ -330,11 +327,10 @@ class TargetViewSet(viewsets.ModelViewSet):
# Call the service-layer business method (soft delete + dispatch hard-delete task)
result = self.target_service.delete_targets_two_phase(ids)
return Response({
'message': f"Deleted {result['soft_deleted_count']} targets",
return success_response(data={
'deletedCount': result['soft_deleted_count'],
'deletedTargets': result['target_names']
}, status=200)
})
except ValueError as e:
raise NotFound(str(e))
@@ -389,7 +385,7 @@ class TargetViewSet(viewsets.ModelViewSet):
raise ValidationError(str(e))
# 3. Return the response
return Response(result, status=status.HTTP_201_CREATED)
return success_response(data=result, status_code=status.HTTP_201_CREATED)
# The subdomains action has moved to the nested SubdomainViewSet route
# GET /api/targets/{id}/subdomains/ -> SubdomainViewSet

View File

@@ -177,6 +177,10 @@ STATIC_URL = 'static/'
DEFAULT_AUTO_FIELD = 'django.db.models.BigAutoField'
# ==================== Worker API Key configuration ====================
# Worker node authentication key (read from an environment variable)
WORKER_API_KEY = os.environ.get('WORKER_API_KEY', '')
# ==================== REST Framework configuration ====================
REST_FRAMEWORK = {
'DEFAULT_PAGINATION_CLASS': 'apps.common.pagination.BasePagination', # Use the base paginator
@@ -186,6 +190,14 @@ REST_FRAMEWORK = {
'apps.common.authentication.CsrfExemptSessionAuthentication',
],
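# Global permission configuration: authentication is required by default; public endpoints and Worker endpoints are handled separately in the permission class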
'DEFAULT_PERMISSION_CLASSES': [
'apps.common.permissions.IsAuthenticatedOrPublic',
],
# Custom exception handler: unify the 401/403 error response format
'EXCEPTION_HANDLER': 'apps.common.exception_handlers.custom_exception_handler',
# JSON naming conversion: backend snake_case ↔ frontend camelCase
'DEFAULT_RENDERER_CLASSES': (
'djangorestframework_camel_case.render.CamelCaseJSONRenderer', # Convert response data to camelCase
@@ -278,8 +290,9 @@ ENABLE_COMMAND_LOGGING = get_bool_env('ENABLE_COMMAND_LOGGING', True)
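# ==================== Data directory configuration (everything under /opt/xingrin) ====================
# All data directories are mounted under /opt/xingrin for easier management and backup
# Base path for scan tools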
SCAN_TOOLS_BASE_PATH = os.getenv('SCAN_TOOLS_PATH', '/opt/xingrin/tools')
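# Base path for scan tools (inside the worker container, FHS-compliant)
# Use /opt/xingrin-tools/bin to isolate project-specific scan tools and avoid conflicts with system tools or Python packages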
SCAN_TOOLS_BASE_PATH = os.getenv('SCAN_TOOLS_PATH', '/opt/xingrin-tools/bin')
# Base path for wordlist files
WORDLISTS_BASE_PATH = os.getenv('WORDLISTS_PATH', '/opt/xingrin/wordlists')
@@ -297,9 +310,6 @@ NUCLEI_PUBLIC_TEMPLATES_DIR = os.getenv('NUCLEI_PUBLIC_TEMPLATES_DIR', '/opt/xin
# URL of the official Nuclei templates repository
NUCLEI_TEMPLATES_REPO_URL = os.getenv('NUCLEI_TEMPLATES_REPO_URL', 'https://github.com/projectdiscovery/nuclei-templates.git')
# Git mirror acceleration (used to speed up cloning GitHub repositories)
GIT_MIRROR = os.getenv('GIT_MIRROR', '').strip()
# Public host and port (used by Workers to reach Django)
PUBLIC_HOST = os.getenv('PUBLIC_HOST', 'localhost').strip()
PUBLIC_PORT = os.getenv('PUBLIC_PORT', '8083').strip() # Public HTTPS port
@@ -357,25 +367,16 @@ HOST_WORDLISTS_DIR = '/opt/xingrin/wordlists'
# ============================================
# Worker configuration center (task containers fetch it from /api/workers/config/)
# ============================================
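# Worker database/Redis addresses are returned dynamically by the config API in worker_views.py
# The Worker database address is returned dynamically by the config API in worker_views.py
# Different configuration is returned depending on the request source (local/remote):
# - Local Worker (inside the Docker network): use the internal service names (postgres, redis)
# - Local Worker (inside the Docker network): use the internal service name postgres
# - Remote Worker (public network access): use PUBLIC_HOST
#
# The variables below are fallback/compatibility settings only; the actual configuration is generated dynamically by the API
# Note: Redis is used only inside the Server container; Workers do not need to connect to Redis directly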
_db_host = DATABASES['default']['HOST']
_is_internal_db = _db_host in ('postgres', 'localhost', '127.0.0.1')
WORKER_DB_HOST = os.getenv('WORKER_DB_HOST', _db_host)
# Redis address for remote Workers (derived automatically)
# - If PUBLIC_HOST is an external IP → use PUBLIC_HOST
# - If PUBLIC_HOST is a Docker-internal name → use redis (local deployment)
_is_internal_public = PUBLIC_HOST in ('server', 'localhost', '127.0.0.1')
WORKER_REDIS_URL = os.getenv(
'WORKER_REDIS_URL',
'redis://redis:6379/0' if _is_internal_public else f'redis://{PUBLIC_HOST}:6379/0'
)
# Mount target paths inside the container (everything under /opt/xingrin)
CONTAINER_RESULTS_MOUNT = '/opt/xingrin/results'
CONTAINER_LOGS_MOUNT = '/opt/xingrin/logs'
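
The comments above describe the config API choosing a database host according to where the Worker runs, but the API code itself is not in this diff. A rough sketch of that decision follows, assuming the caller already knows whether the Worker is local. The function name `resolve_worker_db_host` and its parameters are hypothetical, and how the real `worker_views.py` detects a local request is not shown here.

```python
def resolve_worker_db_host(is_local_worker: bool, public_host: str,
                           internal_host: str = "postgres") -> str:
    """Hypothetical helper: pick the database host a Worker should use."""
    # Local Workers share the Docker network, so the service name resolves.
    if is_local_worker:
        return internal_host
    # Remote Workers connect over the public network via PUBLIC_HOST.
    return public_host


# Example: a remote Worker is pointed at the public address.
remote_host = resolve_worker_db_host(False, "203.0.113.10")
local_host = resolve_worker_db_host(True, "203.0.113.10")
```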

View File

@@ -16,7 +16,6 @@ Including another URLconf
"""
from django.contrib import admin
from django.urls import path, include
from rest_framework import permissions
from drf_yasg.views import get_schema_view
from drf_yasg import openapi
@@ -30,7 +29,6 @@ schema_view = get_schema_view(
description="Web application reconnaissance tool API documentation",
),
public=True,
permission_classes=(permissions.AllowAny,),
)
urlpatterns = [

File diff suppressed because it is too large Load Diff

View File

@@ -28,6 +28,10 @@ uvicorn[standard]==0.30.1
# SSH & remote deployment
paramiko>=3.0.0
# Docker management
docker>=6.0.0 # Python Docker SDK
packaging>=21.0 # Version comparison
# Testing framework
pytest==8.0.0
pytest-django==4.7.0
@@ -37,6 +41,7 @@ python-dateutil==2.9.0
pytz==2024.1
validators==0.22.0
PyYAML==6.0.1
ruamel.yaml>=0.18.0 # YAML parsing that preserves comments
colorlog==6.8.2 # Colored log output
python-json-logger==2.0.7 # Structured JSON logging
Jinja2>=3.1.6 # Command template engine

View File

@@ -180,6 +180,28 @@ def get_db_config() -> dict:
}
def generate_raw_response_headers(headers_dict: dict) -> str:
"""
Convert a response-header dict into a raw HTTP response-header string.
Args:
headers_dict: Response-header dict
Returns:
Raw HTTP response-header string, for example:
HTTP/1.1 200 OK
Server: nginx
Content-Type: text/html
...
"""
lines = ['HTTP/1.1 200 OK']
for key, value in headers_dict.items():
# Replace underscores with hyphens and title-case each word
header_name = key.replace('_', '-').title()
lines.append(f'{header_name}: {value}')
return '\r\n'.join(lines)
DB_CONFIG = get_db_config()
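
As a quick illustration of the helper added in this hunk, the function body below is copied from the diff with indentation restored; the sample header dict is made up for the example. Note that the status line is always `HTTP/1.1 200 OK`, independent of the `status_code` column written for the same row.

```python
def generate_raw_response_headers(headers_dict: dict) -> str:
    """Convert a response-header dict into a raw HTTP header string."""
    lines = ['HTTP/1.1 200 OK']
    for key, value in headers_dict.items():
        # Replace underscores with hyphens and title-case each word.
        header_name = key.replace('_', '-').title()
        lines.append(f'{header_name}: {value}')
    return '\r\n'.join(lines)


# Sample input (illustrative only).
raw = generate_raw_response_headers({'server': 'nginx', 'x_powered_by': 'PHP/8.2'})
print(raw.split('\r\n'))
# ['HTTP/1.1 200 OK', 'Server: nginx', 'X-Powered-By: PHP/8.2']
```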
@@ -222,6 +244,9 @@ class TestDataGenerator:
self.create_ehole_fingerprints()
self.create_goby_fingerprints()
self.create_wappalyzer_fingerprints()
self.create_fingers_fingerprints()
self.create_fingerprinthub_fingerprints()
self.create_arl_fingerprints()
self.conn.commit()
print("\n✅ Test data generation complete!")
@@ -238,6 +263,7 @@ class TestDataGenerator:
tables = [
# Fingerprint tables
'ehole_fingerprint', 'goby_fingerprint', 'wappalyzer_fingerprint',
'fingers_fingerprint', 'fingerprinthub_fingerprint', 'arl_fingerprint',
# Snapshot tables (deleted first because they have foreign keys to scan)
'vulnerability_snapshot', 'host_port_mapping_snapshot', 'directory_snapshot',
'endpoint_snapshot', 'website_snapshot', 'subdomain_snapshot',
@@ -544,6 +570,10 @@ class TestDataGenerator:
'Authentication failed for protected resources.',
]
# Fetch the engine ID-to-name mapping
cur.execute("SELECT id, name FROM scan_engine WHERE id = ANY(%s)", (engine_ids,))
engine_name_map = {row[0]: row[1] for row in cur.fetchall()}
ids = []
# Pick a random number of targets (raised to 80-120)
num_targets = min(random.randint(80, 120), len(target_ids))
@@ -554,7 +584,10 @@ class TestDataGenerator:
num_scans = random.randint(3, 15)
for _ in range(num_scans):
status = random.choices(statuses, weights=status_weights)[0]
engine_id = random.choice(engine_ids)
# Randomly pick 1-3 engines
num_engines = random.randint(1, min(3, len(engine_ids)))
selected_engine_ids = random.sample(engine_ids, num_engines)
selected_engine_names = [engine_name_map.get(eid, f'Engine-{eid}') for eid in selected_engine_ids]
worker_id = random.choice(worker_ids) if worker_ids else None
progress = random.randint(10, 95) if status == 'running' else (100 if status == 'completed' else random.randint(0, 50))
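
The multi-engine selection introduced in this hunk can be isolated as a small function. This is a sketch under stated assumptions: `pick_engines` and `max_engines` are hypothetical names, and the explicit empty-list guard is an addition, since `random.randint(1, 0)` raises `ValueError` when `engine_ids` is empty.

```python
import random


def pick_engines(engine_ids, engine_name_map, max_engines=3):
    """Hypothetical helper: choose 1..max_engines distinct engines."""
    if not engine_ids:
        # Guard added for the sketch; the diff assumes engines exist.
        return [], []
    num_engines = random.randint(1, min(max_engines, len(engine_ids)))
    # random.sample never returns duplicates, so the IDs are distinct.
    selected_ids = random.sample(engine_ids, num_engines)
    # Fall back to a synthetic name when an ID is missing from the map.
    selected_names = [engine_name_map.get(eid, f'Engine-{eid}') for eid in selected_ids]
    return selected_ids, selected_names


ids, names = pick_engines([1, 2, 3, 4], {1: 'fast', 2: 'full'})
```

Keeping the names in the same order as the IDs matters because the two lists are stored in separate columns (`engine_ids` and `engine_names`) and are presumably paired by position when displayed.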
@@ -577,20 +610,20 @@ class TestDataGenerator:
cur.execute("""
INSERT INTO scan (
target_id, engine_id, status, worker_id, progress, current_stage,
target_id, engine_ids, engine_names, merged_configuration, status, worker_id, progress, current_stage,
results_dir, error_message, container_ids, stage_progress,
cached_subdomains_count, cached_websites_count, cached_endpoints_count,
cached_ips_count, cached_directories_count, cached_vulns_total,
cached_vulns_critical, cached_vulns_high, cached_vulns_medium, cached_vulns_low,
created_at, stopped_at, deleted_at
) VALUES (
%s, %s, %s, %s, %s, %s, %s, %s, %s, %s,
%s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s,
%s, %s, %s, %s, %s, %s, %s, %s, %s, %s,
NOW() - INTERVAL '%s days', %s, NULL
)
RETURNING id
""", (
target_id, engine_id, status, worker_id, progress, stage,
target_id, selected_engine_ids, json.dumps(selected_engine_names), '', status, worker_id, progress, stage,
f'/app/results/scan_{target_id}_{random.randint(1000, 9999)}', error_msg, '{}', '{}',
subdomains, websites, endpoints, ips, directories, vulns_total,
vulns_critical, vulns_high, vulns_medium, vulns_low,
@@ -647,6 +680,10 @@ class TestDataGenerator:
num_schedules = random.randint(40, 50)
selected = random.sample(schedule_templates, min(num_schedules, len(schedule_templates)))
# Fetch the engine ID-to-name mapping
cur.execute("SELECT id, name FROM scan_engine WHERE id = ANY(%s)", (engine_ids,))
engine_name_map = {row[0]: row[1] for row in cur.fetchall()}
count = 0
for name_base, cron_template in selected:
name = f'{name_base}-{suffix}-{count:02d}'
@@ -658,7 +695,11 @@ class TestDataGenerator:
)
enabled = random.random() > 0.3 # 70% enabled
engine_id = random.choice(engine_ids)
# Randomly pick 1-3 engines
num_engines = random.randint(1, min(3, len(engine_ids)))
selected_engine_ids = random.sample(engine_ids, num_engines)
selected_engine_names = [engine_name_map.get(eid, f'Engine-{eid}') for eid in selected_engine_ids]
# Randomly decide whether to link an organization or a target
if org_ids and target_ids:
if random.random() > 0.5:
@@ -682,12 +723,12 @@ class TestDataGenerator:
cur.execute("""
INSERT INTO scheduled_scan (
name, engine_id, organization_id, target_id, cron_expression, is_enabled,
name, engine_ids, engine_names, merged_configuration, organization_id, target_id, cron_expression, is_enabled,
run_count, last_run_time, next_run_time, created_at, updated_at
) VALUES (%s, %s, %s, %s, %s, %s, %s, %s, %s, NOW() - INTERVAL '%s days', NOW())
) VALUES (%s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, NOW() - INTERVAL '%s days', NOW())
ON CONFLICT DO NOTHING
""", (
name, engine_id, org_id, target_id, cron, enabled,
name, selected_engine_ids, json.dumps(selected_engine_names), '', org_id, target_id, cron, enabled,
run_count if has_run else 0,
datetime.now() - timedelta(days=random.randint(0, 14), hours=random.randint(0, 23)) if has_run else None,
datetime.now() + timedelta(hours=random.randint(1, 336)) # At most 2 weeks from now
@@ -808,7 +849,7 @@ class TestDataGenerator:
]
# Realistic body preview content
body_previews = [
response_bodies = [
'<!DOCTYPE html><html lang="en"><head><meta charset="UTF-8"><meta name="viewport" content="width=device-width, initial-scale=1.0"><title>Login - Enterprise Portal</title><link rel="stylesheet" href="/assets/css/main.css"></head><body><div id="app"></div><script src="/assets/js/bundle.js"></script></body></html>',
'<!DOCTYPE html><html><head><title>Dashboard</title><meta name="description" content="Enterprise management dashboard for monitoring and analytics"><link rel="icon" href="/favicon.ico"></head><body><noscript>You need to enable JavaScript to run this app.</noscript><div id="root"></div></body></html>',
'{"status":"ok","version":"2.4.1","environment":"production","timestamp":"2024-12-22T10:30:00Z","services":{"database":"healthy","cache":"healthy","queue":"healthy"},"uptime":864000}',
@@ -839,14 +880,27 @@ class TestDataGenerator:
# Generate a URL with a fixed length of 245
url = generate_fixed_length_url(target_name, length=245, path_hint=f'website/{i:04d}')
# Generate simulated response-header data
response_headers = {
'server': random.choice(['nginx', 'Apache', 'cloudflare', 'Microsoft-IIS/10.0']),
'content_type': 'text/html; charset=utf-8',
'x_powered_by': random.choice(['PHP/8.2', 'ASP.NET', 'Express', None]),
'x_frame_options': random.choice(['DENY', 'SAMEORIGIN', None]),
'strict_transport_security': 'max-age=31536000; includeSubDomains' if random.choice([True, False]) else None,
'set_cookie': f'session={random.randint(100000, 999999)}; HttpOnly; Secure' if random.choice([True, False]) else None,
}
# Drop None values
response_headers = {k: v for k, v in response_headers.items() if v is not None}
batch_data.append((
url, target_id, target_name, random.choice(titles),
random.choice(webservers), random.choice(tech_stacks),
random.choice([200, 301, 302, 403, 404]),
random.randint(1000, 500000), 'text/html; charset=utf-8',
f'https://{target_name}/login' if random.choice([True, False]) else '',
random.choice(body_previews),
random.choice([True, False, None])
random.choice(response_bodies),
random.choice([True, False, None]),
generate_raw_response_headers(response_headers)
))
# Bulk insert
@@ -855,12 +909,12 @@ class TestDataGenerator:
execute_values(cur, """
INSERT INTO website (
url, target_id, host, title, webserver, tech, status_code,
content_length, content_type, location, body_preview, vhost,
created_at
content_length, content_type, location, response_body, vhost,
response_headers, created_at
) VALUES %s
ON CONFLICT DO NOTHING
RETURNING id
""", batch_data, template="(%s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, NOW())")
""", batch_data, template="(%s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, NOW())")
ids = [row[0] for row in cur.fetchall()]
print(f" ✓ Created {len(batch_data)} websites\n")
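
Each hunk that adds a column here also has to add one `%s` to the `template` string passed to `execute_values`, and the two are easy to get out of step. One way to avoid counting placeholders by hand is to derive the template from the column list. The sketch below assumes a hypothetical helper named `build_values_template`; it is not part of the script in this diff.

```python
def build_values_template(columns, trailing_sql=("NOW()",)):
    """Hypothetical helper: one %s per column, then literal SQL expressions."""
    placeholders = ["%s"] * len(columns) + list(trailing_sql)
    return "(" + ", ".join(placeholders) + ")"


# Parameterized columns of the website INSERT after this change
# (created_at is filled by NOW(), so it is not in the list).
website_columns = [
    "url", "target_id", "host", "title", "webserver", "tech", "status_code",
    "content_length", "content_type", "location", "response_body", "vhost",
    "response_headers",
]
template = build_values_template(website_columns)
```

The resulting string could then be passed as the `template` argument, with each tuple in `batch_data` holding exactly `len(website_columns)` values.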
@@ -961,7 +1015,7 @@ class TestDataGenerator:
]
# Realistic API response body previews
body_previews = [
response_bodies = [
'{"status":"success","data":{"user_id":12345,"username":"john_doe","email":"john@example.com","role":"user","created_at":"2024-01-15T10:30:00Z","last_login":"2024-12-22T08:45:00Z"}}',
'{"success":true,"message":"Authentication successful","token":"eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJzdWIiOiIxMjM0NTY3ODkwIiwibmFtZSI6IkpvaG4gRG9lIiwiaWF0IjoxNTE2MjM5MDIyfQ.SflKxwRJSMeKKF2QT4fwpMeJf36POk6yJV_adQssw5c","expires_in":3600}',
'{"error":"Unauthorized","code":"AUTH_FAILED","message":"Invalid credentials provided. Please check your username and password.","timestamp":"2024-12-22T15:30:45.123Z","request_id":"req_abc123xyz"}',
@@ -1013,14 +1067,27 @@ class TestDataGenerator:
# Generate 10-20 tags (gf_patterns)
tags = random.choice(gf_patterns)
# Generate simulated response-header data
response_headers = {
'server': random.choice(['nginx', 'gunicorn', 'uvicorn', 'Apache']),
'content_type': 'application/json',
'x_request_id': f'req_{random.randint(100000, 999999)}',
'x_ratelimit_limit': str(random.choice([100, 1000, 5000])),
'x_ratelimit_remaining': str(random.randint(0, 1000)),
'cache_control': random.choice(['no-cache', 'max-age=3600', 'private', None]),
}
# Drop None values
response_headers = {k: v for k, v in response_headers.items() if v is not None}
batch_data.append((
url, target_id, target_name, title,
random.choice(['nginx/1.24.0', 'gunicorn/21.2.0']),
random.choice([200, 201, 301, 400, 401, 403, 404, 500]),
random.randint(100, 50000), 'application/json',
tech_list,
'', random.choice(body_previews),
random.choice([True, False, None]), tags
'', random.choice(response_bodies),
random.choice([True, False, None]), tags,
generate_raw_response_headers(response_headers)
))
count += 1
@@ -1029,11 +1096,11 @@ class TestDataGenerator:
execute_values(cur, """
INSERT INTO endpoint (
url, target_id, host, title, webserver, status_code, content_length,
content_type, tech, location, body_preview, vhost, matched_gf_patterns,
created_at
content_type, tech, location, response_body, vhost, matched_gf_patterns,
response_headers, created_at
) VALUES %s
ON CONFLICT DO NOTHING
""", batch_data, template="(%s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, NOW())")
""", batch_data, template="(%s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, NOW())")
print(f" ✓ Created {count} endpoints\n")
@@ -1397,13 +1464,23 @@ class TestDataGenerator:
# Generate a URL with a fixed length of 245
url = generate_fixed_length_url(target_name, length=245, path_hint=f'website-snap/{i:04d}')
# Generate simulated response-header data
response_headers = {
'server': random.choice(['nginx', 'Apache', 'cloudflare']),
'content_type': 'text/html; charset=utf-8',
'x_frame_options': random.choice(['DENY', 'SAMEORIGIN', None]),
}
# Drop None values
response_headers = {k: v for k, v in response_headers.items() if v is not None}
batch_data.append((
scan_id, url, target_name, random.choice(titles),
random.choice(webservers), random.choice(tech_stacks),
random.choice([200, 301, 403]),
random.randint(1000, 50000), 'text/html; charset=utf-8',
'', # location field
'<!DOCTYPE html><html><head><title>Test</title></head><body>Content</body></html>'
'<!DOCTYPE html><html><head><title>Test</title></head><body>Content</body></html>',
generate_raw_response_headers(response_headers)
))
count += 1
@@ -1412,10 +1489,11 @@ class TestDataGenerator:
execute_values(cur, """
INSERT INTO website_snapshot (
scan_id, url, host, title, web_server, tech, status,
content_length, content_type, location, body_preview, created_at
content_length, content_type, location, response_body,
response_headers, created_at
) VALUES %s
ON CONFLICT DO NOTHING
""", batch_data, template="(%s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, NOW())")
""", batch_data, template="(%s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, NOW())")
print(f" ✓ Created {count} website snapshots\n")
@@ -1494,6 +1572,13 @@ class TestDataGenerator:
num_tags = random.randint(10, 20)
tags = random.sample(all_tags, min(num_tags, len(all_tags)))
# Generate simulated response-header data
response_headers = {
'server': 'nginx/1.24.0',
'content_type': 'application/json',
'x_request_id': f'req_{random.randint(100000, 999999)}',
}
batch_data.append((
scan_id, url, target_name, title,
random.choice([200, 201, 401, 403, 404]),
@@ -1502,7 +1587,8 @@ class TestDataGenerator:
'nginx/1.24.0',
'application/json', tech_list,
'{"status":"ok","data":{}}',
tags
tags,
generate_raw_response_headers(response_headers)
))
count += 1
@@ -1511,11 +1597,11 @@ class TestDataGenerator:
execute_values(cur, """
INSERT INTO endpoint_snapshot (
scan_id, url, host, title, status_code, content_length,
location, webserver, content_type, tech, body_preview,
matched_gf_patterns, created_at
location, webserver, content_type, tech, response_body,
matched_gf_patterns, response_headers, created_at
) VALUES %s
ON CONFLICT DO NOTHING
""", batch_data, template="(%s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, NOW())")
""", batch_data, template="(%s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, NOW())")
print(f" ✓ Created {count} endpoint snapshots\n")
@@ -1990,6 +2076,371 @@ class TestDataGenerator:
print(f" ✓ Created {count} Wappalyzer fingerprints\n")
def create_fingers_fingerprints(self):
"""Create Fingers fingerprint data."""
print("🔍 Creating Fingers fingerprints...")
cur = self.conn.cursor()
# Application name templates (long names)
name_templates = [
'Apache-HTTP-Server-Web-Application-Platform-Open-Source-Software',
'Nginx-High-Performance-Web-Server-Reverse-Proxy-Load-Balancer',
'Microsoft-IIS-Internet-Information-Services-Windows-Web-Server',
'Tomcat-Java-Servlet-Container-Apache-Application-Server-Platform',
'WordPress-Content-Management-System-Blogging-Platform-PHP-MySQL',
'Drupal-CMS-Content-Management-Framework-PHP-Community-Platform',
'Joomla-Open-Source-CMS-Web-Content-Management-System-Framework',
'Laravel-PHP-Framework-Web-Application-Development-MVC-Pattern',
'Django-Python-Web-Framework-High-Level-MTV-Architecture-Pattern',
'Ruby-on-Rails-Web-Application-Framework-MVC-Convention-Configuration',
'Express-JS-Node-JS-Web-Application-Framework-Minimal-Flexible',
'Spring-Boot-Java-Framework-Microservices-Enterprise-Application',
'ASP-NET-Core-Cross-Platform-Web-Framework-Microsoft-Open-Source',
'React-JavaScript-Library-Building-User-Interfaces-Facebook-Meta',
'Vue-JS-Progressive-JavaScript-Framework-Web-Application-Development',
'Angular-TypeScript-Platform-Framework-Web-Applications-Google',
'jQuery-JavaScript-Library-DOM-Manipulation-Ajax-Event-Handling',
'Bootstrap-CSS-Framework-Responsive-Mobile-First-Web-Development',
'Tailwind-CSS-Utility-First-Framework-Rapid-UI-Development-Tool',
'Docker-Container-Platform-Application-Deployment-Virtualization',
'Kubernetes-Container-Orchestration-Platform-Cloud-Native-Apps',
'Redis-In-Memory-Data-Structure-Store-Database-Cache-Broker',
'MongoDB-Document-NoSQL-Database-Scalable-High-Performance',
'PostgreSQL-Relational-Database-Management-System-Open-Source',
'MySQL-Database-Management-System-Relational-Database-Oracle',
'Elasticsearch-Search-Analytics-Engine-Distributed-RESTful-API',
'RabbitMQ-Message-Broker-Advanced-Message-Queuing-Protocol',
'Jenkins-Automation-Server-Continuous-Integration-Deployment',
'GitLab-DevOps-Platform-Git-Repository-CI-CD-Pipeline-Management',
'Grafana-Observability-Platform-Metrics-Visualization-Dashboard',
]
# Tag templates
tag_options = [
['web-server', 'http', 'apache', 'linux'],
['web-server', 'reverse-proxy', 'nginx', 'high-performance'],
['web-server', 'windows', 'microsoft', 'iis'],
['cms', 'php', 'wordpress', 'blog', 'mysql'],
['cms', 'php', 'drupal', 'content-management'],
['framework', 'php', 'laravel', 'mvc', 'modern'],
['framework', 'python', 'django', 'full-stack'],
['framework', 'ruby', 'rails', 'mvc', 'convention'],
['framework', 'javascript', 'nodejs', 'express', 'backend'],
['framework', 'java', 'spring', 'enterprise', 'microservices'],
['framework', 'dotnet', 'aspnet', 'microsoft', 'cross-platform'],
['library', 'javascript', 'react', 'frontend', 'ui'],
['framework', 'javascript', 'vue', 'progressive', 'reactive'],
['framework', 'typescript', 'angular', 'google', 'spa'],
['database', 'nosql', 'mongodb', 'document', 'json'],
['database', 'relational', 'postgresql', 'sql', 'open-source'],
['database', 'relational', 'mysql', 'sql', 'oracle'],
['cache', 'database', 'redis', 'in-memory', 'key-value'],
['search', 'analytics', 'elasticsearch', 'distributed', 'restful'],
['container', 'docker', 'virtualization', 'deployment'],
]
# Rule templates
rule_templates = [
# favicon hash rule
[{'method': 'faviconhash', 'favicon': f'-{random.randint(1000000000, 9999999999)}'}],
# keyword rule
[{'method': 'keyword', 'keyword': ['X-Powered-By', 'Server', 'X-Generator']}],
# Mixed rule
[
{'method': 'keyword', 'keyword': ['content="WordPress', 'wp-content/', 'wp-includes/']},
{'method': 'faviconhash', 'favicon': f'-{random.randint(1000000000, 9999999999)}'}
],
# header rule
[{'method': 'keyword', 'keyword': ['Server: nginx', 'X-Powered-By: PHP']}],
# body rule
[{'method': 'keyword', 'keyword': ['<meta name="generator"', 'Powered by', 'Built with']}],
]
# Port templates
port_options = [
[80, 443],
[80, 443, 8080, 8443],
[80, 443, 8000, 8080, 8443],
[3000, 3001, 5000],
[8080, 8081, 8888, 9000],
[443, 8443, 9443],
[], # Empty array
]
count = 0
batch_data = []
for i in range(200): # Generate 200 Fingers fingerprints
name = f'{random.choice(name_templates)}-{random.randint(1000, 9999)}'
link = f'https://www.example-{random.randint(1000, 9999)}.com'
rule = random.choice(rule_templates)
tag = random.choice(tag_options)
focus = random.choice([True, False])
default_port = random.choice(port_options)
batch_data.append((
name, link, json.dumps(rule), json.dumps(tag), focus, json.dumps(default_port)
))
count += 1
if batch_data:
execute_values(cur, """
INSERT INTO fingers_fingerprint (name, link, rule, tag, focus, default_port, created_at)
VALUES %s
ON CONFLICT (name) DO NOTHING
""", batch_data, template="(%s, %s, %s, %s, %s, %s, NOW())")
print(f" ✓ Created {count} Fingers fingerprints\n")
def create_fingerprinthub_fingerprints(self):
"""Create FingerPrintHub fingerprint data."""
print("🔍 Creating FingerPrintHub fingerprints...")
cur = self.conn.cursor()
# FP ID prefixes
fp_id_prefixes = [
'web', 'cms', 'framework', 'server', 'database', 'cache', 'cdn',
'waf', 'load-balancer', 'proxy', 'api', 'admin', 'monitoring'
]
# Application name templates
name_templates = [
'Apache-HTTP-Server-Detection-Web-Platform-Fingerprint',
'Nginx-Web-Server-Identification-Reverse-Proxy-Detection',
'WordPress-CMS-Detection-Content-Management-System-Fingerprint',
'Drupal-CMS-Identification-Web-Content-Platform-Detection',
'Joomla-CMS-Detection-Web-Content-Management-Framework',
'Laravel-Framework-Detection-PHP-Web-Application-Platform',
'Django-Framework-Identification-Python-Web-Framework-Detection',
'Spring-Boot-Framework-Detection-Java-Enterprise-Application',
'React-Library-Detection-JavaScript-UI-Framework-Fingerprint',
'Vue-JS-Framework-Detection-Progressive-JavaScript-Platform',
'Angular-Framework-Identification-TypeScript-Web-Platform',
'Docker-Container-Detection-Virtualization-Platform-Fingerprint',
'Kubernetes-Orchestration-Detection-Container-Management-Platform',
'Redis-Cache-Detection-In-Memory-Database-Fingerprint',
'MongoDB-Database-Detection-NoSQL-Document-Store-Platform',
'PostgreSQL-Database-Detection-Relational-Database-System',
'MySQL-Database-Detection-Relational-Database-Management',
'Elasticsearch-Search-Detection-Analytics-Engine-Platform',
'Jenkins-CI-CD-Detection-Automation-Server-Platform',
'GitLab-DevOps-Detection-Version-Control-Platform-System',
'Grafana-Monitoring-Detection-Observability-Platform-Dashboard',
'Prometheus-Monitoring-Detection-Time-Series-Database-System',
'Kibana-Visualization-Detection-Data-Dashboard-Platform',
'Cloudflare-CDN-Detection-Web-Application-Firewall-Platform',
'Akamai-CDN-Detection-Content-Delivery-Network-Platform',
'AWS-CloudFront-CDN-Detection-Amazon-Web-Services-Platform',
'Microsoft-IIS-Detection-Internet-Information-Services-Server',
'Tomcat-Server-Detection-Java-Servlet-Container-Platform',
'JBoss-Server-Detection-Enterprise-Application-Platform',
'WebLogic-Server-Detection-Oracle-Application-Server-Platform',
]
# Author templates
authors = [
'security-research-team', 'fingerprint-detection-group', 'web-security-lab',
'cyber-threat-intelligence', 'vulnerability-research-team', 'security-automation-team',
'open-source-security', 'community-contributors', 'detection-engineering-team'
]
# Severity levels
severities = ['info', 'low', 'medium', 'high', 'critical']
# metadata templates
metadata_templates = [
{
'vendor': 'Apache Software Foundation',
'product': 'Apache HTTP Server',
'verified': True,
'max-request': 1,
'shodan-query': 'http.server:"Apache"'
},
{
'vendor': 'Nginx Inc',
'product': 'Nginx Web Server',
'verified': True,
'max-request': 1,
'shodan-query': 'http.server:"nginx"'
},
{
'vendor': 'WordPress',
'product': 'WordPress CMS',
'verified': True,
'max-request': 2,
'fofa-query': 'body="wp-content"'
},
{
'vendor': 'Various',
'product': 'Web Framework',
'verified': False,
'max-request': 1
},
]
# HTTP rule templates
http_templates = [
[{
'method': 'GET',
'path': ['{{BaseURL}}'],
'matchers': [{
'type': 'word',
'words': ['Server: nginx', 'X-Powered-By'],
'condition': 'or'
}]
}],
[{
'method': 'GET',
'path': ['{{BaseURL}}/admin'],
'matchers': [{
'type': 'status',
'status': [200, 401, 403]
}]
}],
[{
'method': 'GET',
'path': ['{{BaseURL}}'],
'matchers': [{
'type': 'word',
'words': ['wp-content', 'wordpress'],
'part': 'body',
'condition': 'and'
}]
}],
]
# source_file templates
source_files = [
'fingerprints/web-servers/apache.yaml',
'fingerprints/web-servers/nginx.yaml',
'fingerprints/cms/wordpress.yaml',
'fingerprints/cms/drupal.yaml',
'fingerprints/frameworks/laravel.yaml',
'fingerprints/frameworks/django.yaml',
'fingerprints/frameworks/spring.yaml',
'fingerprints/databases/mongodb.yaml',
'fingerprints/databases/postgresql.yaml',
'fingerprints/cache/redis.yaml',
]
count = 0
batch_data = []
for i in range(200): # Generate 200 FingerPrintHub fingerprints
fp_id = f'{random.choice(fp_id_prefixes)}-detection-{random.randint(10000, 99999)}'
name = f'{random.choice(name_templates)}-{random.randint(1000, 9999)}'
author = random.choice(authors)
tags = ','.join(random.sample(['web', 'cms', 'framework', 'server', 'detection', 'fingerprint'], random.randint(2, 4)))
severity = random.choice(severities)
metadata = random.choice(metadata_templates).copy()
http = random.choice(http_templates)
source_file = random.choice(source_files)
batch_data.append((
fp_id, name, author, tags, severity,
json.dumps(metadata), json.dumps(http), source_file
))
count += 1
if batch_data:
execute_values(cur, """
INSERT INTO fingerprinthub_fingerprint (
fp_id, name, author, tags, severity, metadata, http, source_file, created_at
)
VALUES %s
ON CONFLICT (fp_id) DO NOTHING
""", batch_data, template="(%s, %s, %s, %s, %s, %s, %s, %s, NOW())")
print(f" ✓ Created {count} FingerPrintHub fingerprints\n")
def create_arl_fingerprints(self):
"""Create ARL fingerprint data."""
print("🔍 Creating ARL fingerprints...")
cur = self.conn.cursor()
# Application name templates
name_templates = [
'Apache-HTTP-Server-Web-Platform-Application-Server',
'Nginx-High-Performance-Web-Server-Reverse-Proxy',
'Microsoft-IIS-Internet-Information-Services-Server',
'WordPress-Content-Management-System-Blogging-Platform',
'Drupal-Open-Source-CMS-Content-Management-Framework',
'Joomla-Web-Content-Management-System-Framework',
'Laravel-PHP-Web-Application-Framework-MVC-Pattern',
'Django-Python-Web-Framework-MTV-Architecture',
'Spring-Boot-Java-Enterprise-Application-Framework',
'Express-Node-JS-Web-Application-Framework-Minimal',
'React-JavaScript-Library-User-Interface-Components',
'Vue-JS-Progressive-JavaScript-Framework-Reactive',
'Angular-TypeScript-Web-Application-Framework-Google',
'Docker-Container-Platform-Application-Deployment',
'Kubernetes-Container-Orchestration-Cloud-Native',
'Redis-In-Memory-Database-Cache-Message-Broker',
'MongoDB-Document-NoSQL-Database-Scalable-Platform',
'PostgreSQL-Relational-Database-Management-System',
'MySQL-Database-Management-Relational-Database-Oracle',
'Elasticsearch-Search-Analytics-Engine-Distributed',
'Jenkins-Automation-Server-Continuous-Integration',
'GitLab-DevOps-Platform-Git-Repository-CI-CD-Pipeline',
'Grafana-Observability-Metrics-Visualization-Dashboard',
'Prometheus-Monitoring-Time-Series-Database-Alerting',
'RabbitMQ-Message-Broker-AMQP-Protocol-Queue-System',
'Tomcat-Java-Servlet-Container-Application-Server',
'JBoss-Enterprise-Application-Platform-Java-EE-Server',
'WebLogic-Oracle-Application-Server-Java-Enterprise',
'Cloudflare-CDN-DDoS-Protection-Web-Firewall-Platform',
'Amazon-CloudFront-CDN-Content-Delivery-Network-AWS',
]
# Rule expression templates
rule_templates = [
# Simple rules
'header="Server" && header="nginx"',
'body="WordPress" && body="wp-content"',
'title="Admin Panel" || title="Dashboard"',
'header="X-Powered-By" && header="PHP"',
'body="Powered by" && body="Laravel"',
# Complex rules
'(header="Server" && header="Apache") || (body="Apache" && title="Apache")',
'(body="wp-content" && body="wp-includes") || (header="X-Powered-By" && header="WordPress")',
'(title="Jenkins" && body="Jenkins") || (header="X-Jenkins" && status="200")',
'(body="Spring" && body="Whitelabel Error Page") || header="X-Application-Context"',
'(body="React" && body="react-dom") || (body="__REACT" && body="reactRoot")',
# Rules that include a status code
'status="200" && body="nginx" && title="Welcome to nginx"',
'status="403" && body="Apache" && header="Server"',
'status="401" && header="WWW-Authenticate" && body="Unauthorized"',
# Multi-condition rules
'header="Server" && (body="PHP" || body="Laravel" || body="Symfony")',
'body="Django" && (header="X-Frame-Options" || body="csrfmiddlewaretoken")',
'(title="GitLab" && body="gitlab") || (header="X-GitLab-Feature-Category")',
# JSON API rules
'body="{\\"version\\"" && body="api" && header="Content-Type"',
'status="200" && body="swagger" && body="openapi"',
# Error page rules
'status="404" && body="Not Found" && body="nginx"',
'status="500" && body="Internal Server Error" && body="Apache"',
]
count = 0
batch_data = []
for i in range(200):  # Generate 200 ARL fingerprints
name = f'{random.choice(name_templates)}-{random.randint(1000, 9999)}'
rule = random.choice(rule_templates)
batch_data.append((name, rule))
count += 1
if batch_data:
execute_values(cur, """
INSERT INTO arl_fingerprint (name, rule, created_at)
VALUES %s
ON CONFLICT (name) DO NOTHING
""", batch_data, template="(%s, %s, NOW())")
print(f" ✓ Created {count} ARL fingerprints\n")
class MillionDataGenerator:
"""
@@ -2174,9 +2625,10 @@ class MillionDataGenerator:
if len(batch_data) >= batch_size:
execute_values(cur, """
INSERT INTO website (url, target_id, host, title, webserver, tech,
status_code, content_length, content_type, location, body_preview, created_at)
status_code, content_length, content_type, location, response_body,
vhost, response_headers, created_at)
VALUES %s ON CONFLICT DO NOTHING
""", batch_data, template="(%s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, NOW())")
""", batch_data, template="(%s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, NULL, '', NOW())")
self.conn.commit()
batch_data = []
print(f"{count:,} / {target_count:,}")
@@ -2186,9 +2638,10 @@ class MillionDataGenerator:
if batch_data:
execute_values(cur, """
INSERT INTO website (url, target_id, host, title, webserver, tech,
status_code, content_length, content_type, location, body_preview, created_at)
status_code, content_length, content_type, location, response_body,
vhost, response_headers, created_at)
VALUES %s ON CONFLICT DO NOTHING
""", batch_data, template="(%s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, NOW())")
""", batch_data, template="(%s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, NULL, '', NOW())")
self.conn.commit()
print(f" ✓ Created {count:,} websites\n")
@@ -2262,10 +2715,10 @@ class MillionDataGenerator:
if len(batch_data) >= batch_size:
execute_values(cur, """
INSERT INTO endpoint (url, target_id, host, title, webserver, status_code,
content_length, content_type, tech, location, body_preview, vhost,
matched_gf_patterns, created_at)
content_length, content_type, tech, location, response_body, vhost,
matched_gf_patterns, response_headers, created_at)
VALUES %s ON CONFLICT DO NOTHING
""", batch_data, template="(%s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, NOW())")
""", batch_data, template="(%s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, '', NOW())")
self.conn.commit()
batch_data = []
print(f"{count:,} / {target_count:,}")
@@ -2275,10 +2728,10 @@ class MillionDataGenerator:
if batch_data:
execute_values(cur, """
INSERT INTO endpoint (url, target_id, host, title, webserver, status_code,
content_length, content_type, tech, location, body_preview, vhost,
matched_gf_patterns, created_at)
content_length, content_type, tech, location, response_body, vhost,
matched_gf_patterns, response_headers, created_at)
VALUES %s ON CONFLICT DO NOTHING
""", batch_data, template="(%s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, NOW())")
""", batch_data, template="(%s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, '', NOW())")
self.conn.commit()
print(f" ✓ Created {count:,} endpoints\n")
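
Note on the ARL fingerprint generator above: names are drawn at random and inserted with `ON CONFLICT (name) DO NOTHING`, so `count` reflects rows attempted rather than rows actually stored. A minimal sketch of one way to make the two agree is to deduplicate names before the insert. The helper name `build_arl_batch` and the sample templates are hypothetical and not part of this diff.

```python
import random


def build_arl_batch(name_templates, rule_templates, n, rng=None):
    """Return n (name, rule) tuples whose names are all distinct.

    Hypothetical helper: mirrors the naming scheme in the diff
    (template + 4-digit suffix) but skips names already produced.
    """
    rng = rng or random.Random()
    # Suffixes range over 1000..9999, i.e. 9000 values per template.
    max_unique = len(name_templates) * 9000
    if n > max_unique:
        raise ValueError(f"cannot build {n} unique names from {max_unique} candidates")

    seen = set()
    batch = []
    while len(batch) < n:
        name = f"{rng.choice(name_templates)}-{rng.randint(1000, 9999)}"
        if name in seen:
            continue  # a collision would be silently dropped by ON CONFLICT
        seen.add(name)
        batch.append((name, rng.choice(rule_templates)))
    return batch


# Illustrative inputs only; the real script has 30 name templates.
batch = build_arl_batch(
    ["Nginx-High-Performance-Web-Server-Reverse-Proxy"],
    ['header="Server" && header="nginx"'],
    200,
    random.Random(0),
)
```

With a batch built this way, `len(batch)` matches the number of rows the insert can store, provided the table holds no conflicting names beforehand.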

@@ -95,6 +95,7 @@ EOF
RESPONSE=$(curl -k -s -X POST \
-H "Content-Type: application/json" \
-H "X-Worker-API-Key: ${WORKER_API_KEY}" \
-d "$REGISTER_DATA" \
"${API_URL}/api/workers/register/" 2>/dev/null)
@@ -116,7 +117,7 @@ if [ -z "$WORKER_ID" ]; then
# Wait for the server to become ready
log "Waiting for the server to become ready..."
for i in $(seq 1 30); do
if curl -k -s "${API_URL}/api/" > /dev/null 2>&1; then
if curl -k -s -H "X-Worker-API-Key: ${WORKER_API_KEY}" "${API_URL}/api/workers/config/?is_local=${IS_LOCAL}" > /dev/null 2>&1; then
log "${GREEN}Server is ready${NC}"
break
fi
@@ -189,6 +190,7 @@ EOF
RESPONSE_FILE=$(mktemp)
HTTP_CODE=$(curl -k -s -o "$RESPONSE_FILE" -w "%{http_code}" -X POST \
-H "Content-Type: application/json" \
-H "X-Worker-API-Key: ${WORKER_API_KEY}" \
-d "$JSON_DATA" \
"${API_URL}/api/workers/${WORKER_ID}/heartbeat/" 2>/dev/null || echo "000")
RESPONSE_BODY=$(cat "$RESPONSE_FILE" 2>/dev/null)
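
The hunks above add an `X-Worker-API-Key` header to the register, readiness and heartbeat calls. As a rough Python equivalent of the heartbeat request, the sketch below only builds the request object and does not send it. The function name `build_heartbeat_request`, the example host and the payload field are assumptions for illustration; the URL shape and header names come from the script.

```python
import json
import urllib.request


def build_heartbeat_request(api_url, worker_id, api_key, payload):
    """Build (but do not send) an authenticated heartbeat POST."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url=f"{api_url}/api/workers/{worker_id}/heartbeat/",
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            # Same header the shell script passes with curl -H
            "X-Worker-API-Key": api_key,
        },
    )


# Hypothetical values; "cpu" is an invented payload field.
req = build_heartbeat_request("https://example.invalid", 7, "secret", {"cpu": 1.5})
```

Sending it would be a matter of passing `req` to `urllib.request.urlopen`. Note that the shell script uses `curl -k`, which skips TLS certificate verification; this sketch does not reproduce that.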

@@ -30,6 +30,7 @@ IMAGE="${DOCKER_USER}/xingrin-agent:${IMAGE_TAG}"
# Preset variables (replaced by deploy_service.py during remote deployment)
PRESET_SERVER_URL="{{HEARTBEAT_API_URL}}"
PRESET_WORKER_ID="{{WORKER_ID}}"
PRESET_API_KEY="{{WORKER_API_KEY}}"
# Color definitions
GREEN='\033[0;32m'
@@ -68,6 +69,7 @@ start_agent() {
-e SERVER_URL="${PRESET_SERVER_URL}" \
-e WORKER_ID="${PRESET_WORKER_ID}" \
-e IMAGE_TAG="${IMAGE_TAG}" \
-e WORKER_API_KEY="${PRESET_API_KEY}" \
-v /proc:/host/proc:ro \
${IMAGE}

@@ -9,9 +9,8 @@ DB_USER=postgres
DB_PASSWORD=123.com
# ==================== Redis configuration ====================
# Within the Docker network, the Redis service name is redis
# Redis is used only on the internal Docker network; no port is exposed publicly
REDIS_HOST=redis
REDIS_PORT=6379
REDIS_DB=0
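# ==================== Service port configuration ====================
@@ -41,6 +40,9 @@ SCAN_RESULTS_DIR=/opt/xingrin/results
# Django log directory
# Note: if this variable is left empty or removed, logs go only to the Docker console (stdout) and are not written to files
LOG_DIR=/opt/xingrin/logs
# Scan tool path (a path inside the container; follows the FHS and is isolated to avoid naming conflicts)
# The default is already set in settings.py; no change is needed unless you must fall back to the old path
SCAN_TOOLS_PATH=/opt/xingrin-tools/bin
# ==================== Log level configuration ====================
# Application log level: DEBUG / INFO / WARNING / ERROR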
@@ -48,6 +50,12 @@ LOG_LEVEL=INFO
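# Whether to log command execution (increases disk usage during heavy scanning)
ENABLE_COMMAND_LOGGING=true
# ==================== Worker API key configuration ====================
# Worker node authentication key (used for API authentication between Workers and the main server)
# In production, always replace this with a strong random key (a random string of at least 32 characters is recommended)
# To generate one: openssl rand -hex 32
WORKER_API_KEY=change-me-to-a-secure-random-key
# ==================== Docker Hub configuration (production mode) ====================
# Used when pulling images from Docker Hub in production mode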
DOCKER_USER=yyhuni
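
The `.env` comment suggests `openssl rand -hex 32` for generating `WORKER_API_KEY`. A minimal Python sketch of the same idea follows, using the standard `secrets` module. The helper names and the acceptance check are assumptions for illustration; nothing in this diff shows the server performing such a check.

```python
import secrets

# Placeholder value shipped in the example .env file.
PLACEHOLDER = "change-me-to-a-secure-random-key"


def generate_worker_api_key(num_bytes=32):
    """Return num_bytes of randomness as hex (64 characters for 32 bytes),
    comparable to `openssl rand -hex 32`."""
    return secrets.token_hex(num_bytes)


def is_acceptable_key(key):
    """Hypothetical sanity check: reject the shipped placeholder and short keys."""
    return key != PLACEHOLDER and len(key) >= 32


key = generate_worker_api_key()
```

The placeholder itself happens to be 32 characters long, so a length check alone would not catch an unchanged `.env`; the explicit comparison is what rejects it.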

@@ -24,8 +24,6 @@ services:
redis:
image: redis:7-alpine
restart: always
ports:
- "${REDIS_PORT}:6379"
healthcheck:
test: ["CMD", "redis-cli", "ping"]
interval: 5s
@@ -49,7 +47,8 @@ services:
- /opt/xingrin:/opt/xingrin
- /var/run/docker.sock:/var/run/docker.sock
healthcheck:
test: ["CMD", "curl", "-f", "http://localhost:8888/api/"]
# Use the dedicated health check endpoint (no authentication required)
test: ["CMD", "curl", "-f", "http://localhost:8888/api/health/"]
interval: 30s
timeout: 10s
retries: 3
@@ -65,9 +64,10 @@ services:
restart: always
environment:
- SERVER_URL=http://server:8888
- WORKER_NAME=本地节点
- WORKER_NAME=Local-Worker
- IS_LOCAL=true
- IMAGE_TAG=${IMAGE_TAG:-dev}
- WORKER_API_KEY=${WORKER_API_KEY}
depends_on:
server:
condition: service_healthy

@@ -30,8 +30,6 @@ services:
redis:
image: redis:7-alpine
restart: always
ports:
- "${REDIS_PORT}:6379"
healthcheck:
test: ["CMD", "redis-cli", "ping"]
interval: 5s
@@ -52,7 +50,8 @@ services:
# Docker socket mount: lets the Django server run local docker commands (used to dispatch tasks to the local Worker)
- /var/run/docker.sock:/var/run/docker.sock
healthcheck:
test: ["CMD", "curl", "-f", "http://localhost:8888/api/"]
# Use the dedicated health check endpoint (no authentication required)
test: ["CMD", "curl", "-f", "http://localhost:8888/api/health/"]
interval: 30s
timeout: 10s
retries: 3
@@ -69,9 +68,10 @@ services:
restart: always
environment:
- SERVER_URL=http://server:8888
- WORKER_NAME=本地节点
- WORKER_NAME=Local-Worker
- IS_LOCAL=true
- IMAGE_TAG=${IMAGE_TAG}
- WORKER_API_KEY=${WORKER_API_KEY}
depends_on:
server:
condition: service_healthy

@@ -9,3 +9,12 @@ psql -v ON_ERROR_STOP=1 --username "$POSTGRES_USER" --dbname "postgres" <<-EOSQL
GRANT ALL PRIVILEGES ON DATABASE xingrin TO "$POSTGRES_USER";
GRANT ALL PRIVILEGES ON DATABASE xingrin_dev TO "$POSTGRES_USER";
EOSQL
# Enable the pg_trgm extension (used for fuzzy text search indexes)
psql -v ON_ERROR_STOP=1 --username "$POSTGRES_USER" --dbname "xingrin" <<-EOSQL
CREATE EXTENSION IF NOT EXISTS pg_trgm;
EOSQL
psql -v ON_ERROR_STOP=1 --username "$POSTGRES_USER" --dbname "xingrin_dev" <<-EOSQL
CREATE EXTENSION IF NOT EXISTS pg_trgm;
EOSQL
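
For readers unfamiliar with what `pg_trgm` indexes, the sketch below approximates its trigram extraction and similarity score in plain Python. It is a simplified illustration, assuming ASCII alphanumeric input; PostgreSQL's own handling of non-ASCII text and its exact scoring may differ. It is not code from this repository.

```python
import re


def trigrams(text):
    """Approximate pg_trgm trigram extraction.

    Each lowercase alphanumeric word is padded with two leading spaces
    and one trailing space, then cut into 3-character windows.
    """
    result = set()
    for word in re.findall(r"[a-z0-9]+", text.lower()):
        padded = f"  {word} "
        for i in range(len(padded) - 2):
            result.add(padded[i:i + 3])
    return result


def similarity(a, b):
    """Shared trigrams divided by all distinct trigrams of both strings."""
    ta, tb = trigrams(a), trigrams(b)
    if not ta and not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)
```

For example, `trigrams("cat")` yields the four trigrams `"  c"`, `" ca"`, `"cat"` and `"at "`. A GIN index over such trigrams is what lets substring and fuzzy matches avoid a full table scan.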

@@ -165,7 +165,7 @@ main() {
init_engine_config
init_wordlists
# init_fingerprints
init_fingerprints
init_nuclei_templates
init_admin_user

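@@ -23,7 +23,7 @@ echo " ✓ Default directory wordlist is ready"
echo " [1.4/3] Initializing the default fingerprint library..."
# python manage.py init_fingerprints
python manage.py init_fingerprints
echo " ✓ Default fingerprint library is ready"
# 2. Start the Django uvicorn service (ASGI)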

@@ -1,5 +1,6 @@
# Stage 1: build the tools with the official Go image
FROM golang:1.24 AS go-builder
# Pin the digest so upstream updates do not invalidate the cache
FROM golang:1.24@sha256:7e050c14ae9ca5ae56408a288336545b18632f51402ab0ec8e7be0e649a1fc42 AS go-builder
ENV GOPROXY=https://goproxy.cn,direct
# Naabu requires CGO and libpcap
@@ -36,7 +37,8 @@ RUN CGO_ENABLED=0 go install -v github.com/owasp-amass/amass/v5/cmd/amass@main
RUN go install github.com/hahwul/dalfox/v2@latest
# Stage 2: runtime image
FROM ubuntu:24.04
# Pin the digest so upstream updates do not invalidate the cache
FROM ubuntu:24.04@sha256:4fdf0125919d24aec972544669dcd7d6a26a8ad7e6561c73d5549bd6db258ac2
# Avoid interactive prompts
ENV DEBIAN_FRONTEND=noninteractive
@@ -71,20 +73,26 @@ RUN pipx install uro && \
pipx install waymore && \
pipx install dnsgen
# 3. Install Sublist3r (kept under /opt/xingrin/tools with the other tools)
RUN git clone https://github.com/aboul3la/Sublist3r.git /opt/xingrin/tools/Sublist3r && \
pip3 install --no-cache-dir -r /opt/xingrin/tools/Sublist3r/requirements.txt --break-system-packages
# 3. Install Sublist3r (a Python script tool, placed in the standard /usr/local/share directory)
RUN git clone https://github.com/aboul3la/Sublist3r.git /usr/local/share/Sublist3r && \
pip3 install --no-cache-dir -r /usr/local/share/Sublist3r/requirements.txt --break-system-packages
# 4. Copy the Go environment and the compiled tools from the go-builder stage
# Scan tools all go under /opt/xingrin/tools/ to avoid conflicts with Python packages (e.g. httpx)
# Create a project-specific tool directory (follows the FHS; /opt is for self-contained software packages)
# This avoids conflicts with system tools or Python packages, and avoids being shadowed by the /opt/xingrin mount
RUN mkdir -p /opt/xingrin-tools/bin
ENV GOPATH=/root/go
ENV PATH=/usr/local/go/bin:/opt/xingrin/tools:$PATH:$GOPATH/bin
ENV GOPROXY=https://goproxy.cn,direct
RUN mkdir -p /opt/xingrin/tools
COPY --from=go-builder /usr/local/go /usr/local/go
COPY --from=go-builder /go/bin/* /opt/xingrin/tools/
COPY --from=go-builder /usr/local/bin/massdns /opt/xingrin/tools/massdns
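# Copy the scan tools from go-builder into the dedicated directory (avoids conflicts with system tools or Python packages)
COPY --from=go-builder /go/bin/* /opt/xingrin-tools/bin/
COPY --from=go-builder /usr/local/bin/massdns /opt/xingrin-tools/bin/massdns
# Add the dedicated tool directory to PATH with higher priority than /usr/local/bin to avoid conflicts
ENV PATH=/opt/xingrin-tools/bin:/usr/local/go/bin:/usr/local/bin:$PATH:$GOPATH/bin
# 5. Install uv (a Python package manager) and install the Python dependencies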
COPY backend/requirements.txt .
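
The Dockerfile comments above rely on `PATH` ordering: because `/opt/xingrin-tools/bin` comes first, a Go-built `httpx` there wins over a same-named Python entry point in `/usr/local/bin`. The sketch below demonstrates that lookup rule with `shutil.which` and two temporary directories. It assumes a POSIX system; the directory names and the stub scripts are stand-ins, not the real tools.

```python
import os
import shutil
import stat
import tempfile


def make_fake_tool(directory, name):
    """Create a tiny executable stub and return its path."""
    path = os.path.join(directory, name)
    with open(path, "w") as f:
        f.write("#!/bin/sh\nexit 0\n")
    os.chmod(path, os.stat(path).st_mode | stat.S_IXUSR)
    return path


with tempfile.TemporaryDirectory() as root:
    # Stand-ins for /opt/xingrin-tools/bin and /usr/local/bin.
    tools_dir = os.path.join(root, "xingrin-tools-bin")
    system_dir = os.path.join(root, "usr-local-bin")
    os.makedirs(tools_dir)
    os.makedirs(system_dir)

    go_httpx = make_fake_tool(tools_dir, "httpx")
    py_httpx = make_fake_tool(system_dir, "httpx")

    # The first directory on the search path that contains the name wins.
    resolved = shutil.which("httpx", path=os.pathsep.join([tools_dir, system_dir]))
    reversed_resolved = shutil.which("httpx", path=os.pathsep.join([system_dir, tools_dir]))
```

With the tool directory first, `resolved` points at the stub in the tool directory; swapping the order makes the other stub win, which is the conflict the new `PATH` ordering is meant to prevent.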

@@ -13,21 +13,16 @@
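- **Permissions**: sudo administrator privileges
- **Port requirements**: the following ports must be open
- `8083` - HTTPS access (primary access port)
- `5432` - PostgreSQL database (if using a local database)
- `6379` - Redis cache service
- `5432` - PostgreSQL database (if using a local database and there are remote Workers)
- The backend API listens on 8888 only inside the container and is reverse-proxied by nginx to 8083; 8888 does not need to be opened to the public internet
- Redis is used only on the internal Docker network and does not need to be exposed externally
## One-click installation
### 1. Download the project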
```bash
# Option 1: Git clone (recommended)
git clone https://github.com/your-username/xingrin.git
cd xingrin
# Option 2: download the ZIP
wget https://github.com/your-username/xingrin/archive/main.zip
unzip main.zip && cd xingrin-main
```
### 2. Run the installer
@@ -60,8 +55,7 @@ sudo ./install.sh --no-frontend
#### Ports that must be allowed
```
8083 - HTTPS access (primary access port)
5432 - PostgreSQL (if using a local database)
6379 - Redis cache
5432 - PostgreSQL (if using a local database and there are remote Workers)
```
#### Recommended setup
@@ -110,9 +104,6 @@ graph TD
# Restart the services
./restart.sh
# Update the system
./update.sh
# Uninstall the system
./uninstall.sh
```
@@ -234,11 +225,6 @@ docker logs --tail 100 xingrin-agent
tail -f /opt/xingrin/logs/*.log
```
### 3. Update regularly
```bash
# Run system updates regularly
./update.sh
```
## Next steps

Binary file not shown (image; 95 KiB before, 112 KiB after).

@@ -24,6 +24,7 @@ import { Suspense } from "react"
import Script from "next/script"
import { QueryProvider } from "@/components/providers/query-provider"
import { ThemeProvider } from "@/components/providers/theme-provider"
import { UiI18nProvider } from "@/components/providers/ui-i18n-provider"
// Import common layout components
import { RoutePrefetch } from "@/components/route-prefetch"
@@ -117,12 +118,15 @@ export default async function LocaleLayout({
<NextIntlClientProvider messages={messages}>
{/* QueryProvider provides React Query functionality */}
<QueryProvider>
{/* Route prefetch */}
<RoutePrefetch />
{/* AuthLayout handles authentication and sidebar display */}
<AuthLayout>
{children}
</AuthLayout>
{/* UiI18nProvider provides UI component translations */}
<UiI18nProvider>
{/* Route prefetch */}
<RoutePrefetch />
{/* AuthLayout handles authentication and sidebar display */}
<AuthLayout>
{children}
</AuthLayout>
</UiI18nProvider>
</QueryProvider>
</NextIntlClientProvider>
</ThemeProvider>

Some files were not shown because too many files have changed in this diff.