refactor(asset): move database extension initialization to migrations

- Remove pg_trgm and pg_ivm extension setup from AssetConfig.ready() method
- Move extension creation to migration 0002 using RunSQL operations
- Add pg_trgm extension creation for text search index support
- Add pg_ivm extension creation for IMMV incremental maintenance
- Generate unique cursor names in search_service to prevent concurrent request conflicts
- Add @transaction.non_atomic_requests decorator to export view for server-side cursor compatibility
- Simplify app initialization by delegating extension setup to database migrations
- Improve thread safety and concurrency handling for streaming exports
chore: bump version to v1.3.7-dev
2026-01-31 19:53:11 +08:00 · 2026-01-03 18:20:27 +08:00 · 2026-01-03 09:56:39 +00:00 · 2026-01-03 17:35:54 +08:00 · 2026-01-03 17:26:34 +08:00 · 2026-01-03 08:44:06 +00:00
10 changed files with 127 additions and 174 deletions
--- a/.github/workflows/docker-build.yml
+++ b/.github/workflows/docker-build.yml
@@ -44,6 +44,10 @@ jobs:
            dockerfile: docker/agent/Dockerfile
            context: .
            platforms: linux/amd64,linux/arm64
+          - image: xingrin-postgres
+            dockerfile: docker/postgres/Dockerfile
+            context: docker/postgres
+            platforms: linux/amd64,linux/arm64

    steps:
      - name: Checkout
@@ -106,8 +110,8 @@ jobs:
            ${{ steps.version.outputs.IS_RELEASE == 'true' && format('{0}/{1}:latest', env.IMAGE_PREFIX, matrix.image) || '' }}
          build-args: |
            IMAGE_TAG=${{ steps.version.outputs.VERSION }}
-          cache-from: type=gha,scope=${{ matrix.image }}
-          cache-to: type=gha,mode=max,scope=${{ matrix.image }}
+          cache-from: type=registry,ref=${{ env.IMAGE_PREFIX }}/${{ matrix.image }}:cache
+          cache-to: type=registry,ref=${{ env.IMAGE_PREFIX }}/${{ matrix.image }}:cache,mode=max
          provenance: false
          sbom: false

--- a/README.md
+++ b/README.md
@@ -13,14 +13,14 @@

 <p align="center">
  <a href="#-功能特性">Features</a> •
+  <a href="#-全局资产搜索">Asset Search</a> •
  <a href="#-快速开始">Quick Start</a> •
  <a href="#-文档">Documentation</a> •
-  <a href="#-技术栈">Tech Stack</a> •
  <a href="#-反馈与贡献">Feedback and Contributing</a>
 </p>

 <p align="center">
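-  <sub>🔍 Keywords: ASM | Attack Surface Management | Vulnerability Scanning | Asset Discovery | Bug Bounty | Penetration Testing | Nuclei | Subdomain Enumeration | EASM</sub>
+  <sub>🔍 Keywords: ASM | Attack Surface Management | Vulnerability Scanning | Asset Discovery | Asset Search | Bug Bounty | Penetration Testing | Nuclei | Subdomain Enumeration | EASM</sub>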
 </p>

 ---
@@ -162,9 +162,34 @@ flowchart TB
    W3 -.heartbeat report.-> REDIS
 ```

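+### 🔎 Global Asset Search
+- **Multi-type search** - Supports two asset types: Website and Endpoint
+- **Expression syntax** - Supports the `=` (fuzzy match), `==` (exact match), and `!=` (not equal) operators
+- **Logical composition** - Supports combining conditions with `&&` (AND) and `||` (OR)
+- **Multi-field queries** - Supports the host, url, title, tech, status, body, and header fields
+- **CSV export** - Streams all search results with no row limit
+
+#### Search syntax examples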
+
+```bash
+# Basic search
+host="api"                    # host contains "api"
+status=="200"                 # status code is exactly 200
+tech="nginx"                  # tech stack contains nginx
+
+# Combined search
+host="api" && status=="200"   # host contains api and status code is 200
+tech="vue" || tech="react"    # tech stack contains vue or react
+
+# Complex queries
+host="admin" && tech="php" && status=="200"
+url="/api/v1" && status!="404"
+```
+
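 ### 📊 Visual Interface
 - **Statistics** - Asset and vulnerability statistics dashboard
 - **Real-time notifications** - WebSocket message push
+- **Notification push** - Real-time message push to WeChat Work, Telegram, and Discord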

 ---

@@ -172,7 +197,7 @@ flowchart TB

 ### Requirements

- **Operating system**: Ubuntu 20.04+ / Debian 11+ (recommended)
+- **Operating system**: Ubuntu 20.04+ / Debian 11+
 - **Hardware**: at least 2 CPU cores and 4 GB RAM, 20 GB+ disk space

 ### One-command install
@@ -197,6 +222,7 @@ sudo ./install.sh --mirror
 ### Accessing the service

 - **Web UI**: `https://ip:8083`
+- **Default account**: admin / admin (change the password after first login)

 ### Common commands
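
The search syntax documented in the README hunk above maps naturally onto a parameterized SQL `WHERE` clause. The sketch below is a hypothetical illustration, not the project's `SearchQueryParser` (which this diff does not show): the name `parse_query`, the field-to-column whitelist, and the use of `ILIKE` for the fuzzy `=` operator are all assumptions.

```python
import re
from typing import List, Tuple

# Hypothetical field-to-column whitelist; the real column names are not shown in this diff.
FIELD_COLUMNS = {
    "host": "host",
    "url": "url",
    "title": "title",
    "tech": "tech",
    "status": "status_code",
}

# One condition: field name, operator (==, != or =), double-quoted value.
CONDITION_RE = re.compile(r'\s*(\w+)\s*(==|!=|=)\s*"([^"]*)"\s*')


def parse_query(query: str) -> Tuple[str, List[str]]:
    """Translate `field op "value"` terms joined by && / || into SQL plus params."""
    # Split on the logical operators while keeping them as tokens.
    tokens = re.split(r'(&&|\|\|)', query)
    clauses: List[str] = []
    params: List[str] = []
    for token in tokens:
        if token == "&&":
            clauses.append("AND")
        elif token == "||":
            clauses.append("OR")
        else:
            match = CONDITION_RE.fullmatch(token)
            if match is None:
                raise ValueError(f"invalid condition: {token!r}")
            field, op, value = match.groups()
            if field not in FIELD_COLUMNS:
                raise ValueError(f"unknown field: {field}")
            column = FIELD_COLUMNS[field]
            if op == "==":
                clauses.append(f"{column} = %s")
                params.append(value)
            elif op == "!=":
                clauses.append(f"{column} != %s")
                params.append(value)
            else:
                # Fuzzy match: substring search via ILIKE (an assumption).
                clauses.append(f"{column} ILIKE %s")
                params.append(f"%{value}%")
    return " ".join(clauses), params
```

Values always travel as bind parameters and field names come from a whitelist, so user input never reaches the SQL text. The sketch deliberately ignores quoted values that contain `&&` or `||`, escaping of `%` and `_` in fuzzy values, and array-typed columns such as `tech`, all of which a real parser would need to handle.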

--- a/2
+++ b/2
@@ -1 +1 @@
-v1.3.2-dev
+v1.3.7-dev
--- a/backend/apps/asset/apps.py
+++ b/backend/apps/asset/apps.py
@@ -1,106 +1,6 @@
-import logging
-import sys
-
 from django.apps import AppConfig

-logger = logging.getLogger(__name__)
-

 class AssetConfig(AppConfig):
    default_auto_field = 'django.db.models.BigAutoField'
    name = 'apps.asset'
-    
-    def ready(self):
-        # Import all models so that Django discovers and registers them
-        from . import models
-        
-        # Enable the pg_trgm extension (used by the fuzzy text search indexes)
-        # Covers upgrades of existing databases
-        self._ensure_pg_trgm_extension()
-        
-        # Verify that the pg_ivm extension is available (used for IMMV incremental maintenance)
-        self._verify_pg_ivm_extension()
-    
-    def _ensure_pg_trgm_extension(self):
-        """
-        Ensure the pg_trgm extension is enabled.
-        The extension backs the GIN indexes on the response_body and response_headers fields,
-        which support efficient fuzzy text search.
-        """
-        from django.db import connection
-        
-        # Check whether the database is PostgreSQL
-        if connection.vendor != 'postgresql':
-            logger.debug("Skipping pg_trgm extension: the current database is not PostgreSQL")
-            return
-        
-        try:
-            with connection.cursor() as cursor:
-                cursor.execute("CREATE EXTENSION IF NOT EXISTS pg_trgm;")
-                logger.debug("pg_trgm extension enabled")
-        except Exception as e:
-            # Log the error without blocking application startup
-            # Common cause: insufficient privileges (superuser required)
-            logger.warning(
-                "Could not create the pg_trgm extension: %s. "
-                "The GIN indexes on the response_body and response_headers fields may not work properly. "
-                "Run manually: CREATE EXTENSION IF NOT EXISTS pg_trgm;",
-                str(e)
-            )
-    
-    def _verify_pg_ivm_extension(self):
-        """
-        Verify that the pg_ivm extension is available.
-        pg_ivm provides IMMV (incrementally maintainable materialized views) and is a required extension.
-        If it is unavailable, log an error and exit.
-        """
-        from django.db import connection
-        
-        # Check whether the database is PostgreSQL
-        if connection.vendor != 'postgresql':
-            logger.debug("Skipping pg_ivm verification: the current database is not PostgreSQL")
-            return
-        
-        # Skip certain management commands (such as migrate and makemigrations)
-        import sys
-        if len(sys.argv) > 1 and sys.argv[1] in ('migrate', 'makemigrations', 'collectstatic', 'check'):
-            logger.debug("Skipping pg_ivm verification: running a management command")
-            return
-        
-        try:
-            with connection.cursor() as cursor:
-                # Check whether the pg_ivm extension is installed
-                cursor.execute("""
-                    SELECT COUNT(*) FROM pg_extension WHERE extname = 'pg_ivm'
-                """)
-                count = cursor.fetchone()[0]
-                
-                if count > 0:
-                    logger.info("✓ pg_ivm extension enabled")
-                else:
-                    # Try to create the extension
-                    try:
-                        cursor.execute("CREATE EXTENSION IF NOT EXISTS pg_ivm;")
-                        logger.info("✓ pg_ivm extension created and enabled")
-                    except Exception as create_error:
-                        logger.error(
-                            "=" * 60 + "\n"
-                            "Error: the pg_ivm extension is not installed\n"
-                            "=" * 60 + "\n"
-                            "pg_ivm is a required extension, used for incrementally maintained materialized views.\n\n"
-                            "Install pg_ivm on the PostgreSQL server:\n"
-                            "  curl -sSL https://raw.githubusercontent.com/yyhuni/xingrin/main/docker/scripts/install-pg-ivm.sh | sudo bash\n\n"
-                            "Or install it manually:\n"
-                            "  1. apt install build-essential postgresql-server-dev-15 git\n"
-                            "  2. git clone https://github.com/sraoss/pg_ivm.git && cd pg_ivm && make && make install\n"
-                            "  3. Add to postgresql.conf: shared_preload_libraries = 'pg_ivm'\n"
-                            "  4. Restart PostgreSQL\n"
-                            "=" * 60
-                        )
-                        # Exit in production; only warn in development
-                        from django.conf import settings
-                        if not settings.DEBUG:
-                            sys.exit(1)
-                        
-        except Exception as e:
-            logger.error(f"pg_ivm extension verification failed: {e}")
--- a/backend/apps/asset/migrations/0002_create_search_materialized_view.py
+++ b/backend/apps/asset/migrations/0002_create_search_materialized_view.py
@@ -18,7 +18,13 @@ class Migration(migrations.Migration):
    ]

    operations = [
-        # 1. Ensure the pg_ivm extension is enabled
+        # 1. Ensure the pg_trgm extension is enabled (used by the fuzzy text search indexes)
+        migrations.RunSQL(
+            sql="CREATE EXTENSION IF NOT EXISTS pg_trgm;",
+            reverse_sql="-- pg_trgm extension kept for other uses"
+        ),
+        
+        # 2. Ensure the pg_ivm extension is enabled (used for IMMV incremental maintenance)
        migrations.RunSQL(
            sql="CREATE EXTENSION IF NOT EXISTS pg_ivm;",
            reverse_sql="-- pg_ivm extension kept for other uses"
--- a/backend/apps/asset/services/search_service.py
+++ b/backend/apps/asset/services/search_service.py
@@ -11,7 +11,8 @@

 import logging
 import re
-from typing import Optional, List, Dict, Any, Tuple, Literal
+import uuid
+from typing import Optional, List, Dict, Any, Tuple, Literal, Iterator

 from django.db import connection

@@ -394,3 +395,49 @@ class AssetSearchService:
        except Exception as e:
            logger.error(f"Count query failed: {e}")
            raise
+    
+    def search_iter(
+        self, 
+        query: str, 
+        asset_type: AssetType = 'website',
+        batch_size: int = 1000
+    ) -> Iterator[Dict[str, Any]]:
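+        """
+        Stream asset search results (uses a server-side cursor to keep memory usage low)
+        
+        Args:
+            query: search query string
+            asset_type: asset type ('website' or 'endpoint')
+            batch_size: number of rows fetched per batch
+        
+        Yields:
+            Dict: a single search result
+        """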
+        where_clause, params = SearchQueryParser.parse(query)
+        
+        # Select the view and fields based on the asset type
+        view_name = VIEW_MAPPING.get(asset_type, 'asset_search_view')
+        select_fields = ENDPOINT_SELECT_FIELDS if asset_type == 'endpoint' else WEBSITE_SELECT_FIELDS
+        
+        sql = f"""
+            SELECT {select_fields}
+            FROM {view_name}
+            WHERE {where_clause}
+            ORDER BY created_at DESC
+        """
+        
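+        # Generate a unique cursor name to avoid conflicts between concurrent requests
+        cursor_name = f'export_cursor_{uuid.uuid4().hex[:8]}'
+        
+        try:
+            # Use a server-side cursor to avoid loading all rows into memory at once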
+            with connection.cursor(name=cursor_name) as cursor:
+                cursor.itersize = batch_size
+                cursor.execute(sql, params)
+                columns = [col[0] for col in cursor.description]
+                
+                for row in cursor:
+                    yield dict(zip(columns, row))
+        except Exception as e:
+            logger.error(f"Streaming search query failed: {e}, SQL: {sql}, params: {params}")
+            raise
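
`search_iter` streams rows as dicts instead of materializing the whole result set. The same pattern can be tried without PostgreSQL. The sketch below uses the standard-library `sqlite3` module purely as a stand-in: it has no server-side cursors, so `fetchmany` plays the role that `itersize` plays in the hunk above. `unique_cursor_name`, `iter_rows`, and `demo` are hypothetical names chosen for illustration.

```python
import sqlite3
import uuid
from typing import Any, Dict, Iterator, Sequence


def unique_cursor_name(prefix: str = "export_cursor") -> str:
    """Build a per-request cursor name from 8 hex characters of a uuid4."""
    return f"{prefix}_{uuid.uuid4().hex[:8]}"


def iter_rows(
    conn: sqlite3.Connection,
    sql: str,
    params: Sequence[Any] = (),
    batch_size: int = 1000,
) -> Iterator[Dict[str, Any]]:
    """Yield rows as dicts, holding at most batch_size fetched rows at a time."""
    cursor = conn.cursor()
    try:
        cursor.execute(sql, params)
        columns = [col[0] for col in cursor.description]
        while True:
            batch = cursor.fetchmany(batch_size)
            if not batch:
                break
            for row in batch:
                yield dict(zip(columns, row))
    finally:
        # Runs when the generator is exhausted or closed early.
        cursor.close()


def demo() -> list:
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE asset (host TEXT, status_code INTEGER)")
    conn.executemany(
        "INSERT INTO asset VALUES (?, ?)",
        [("api.example.com", 200), ("admin.example.com", 403)],
    )
    sql = "SELECT host, status_code FROM asset ORDER BY host"
    return list(iter_rows(conn, sql, batch_size=1))
```

One point in the hunk above is worth double-checking: as far as I know, Django's `connection.cursor()` accepts no `name` argument, and named (server-side) cursors are normally obtained through `connection.chunked_cursor()` or the driver-level connection.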
--- a/backend/apps/asset/views/search_views.py
+++ b/backend/apps/asset/views/search_views.py
@@ -28,15 +28,13 @@

 import logging
 import json
-import csv
-from io import StringIO
 from datetime import datetime
 from urllib.parse import urlparse, urlunparse
 from rest_framework import status
 from rest_framework.views import APIView
 from rest_framework.request import Request
 from django.http import StreamingHttpResponse
-from django.db import connection
+from django.db import connection, transaction

 from apps.common.response_helpers import success_response, error_response
 from apps.common.error_codes import ErrorCodes
@@ -287,76 +285,41 @@ class AssetSearchExportView(APIView):
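        asset_type: asset type ('website' or 'endpoint', default 'website')
    
    Response:
-        CSV file stream
-    """
+        CSV file stream (uses a server-side cursor to support large exports)
    
-    # Export row limit
-    MAX_EXPORT_ROWS = 10000
+    Note: the @transaction.non_atomic_requests decorator is used
+    because server-side cursors cannot be used inside a transaction block.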
+    """
    
    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        self.service = AssetSearchService()
    
-    def _parse_headers(self, headers_data) -> str:
-        """Parse the response headers into a string"""
-        if not headers_data:
-            return ''
-        try:
-            headers = json.loads(headers_data)
-            return '; '.join(f'{k}: {v}' for k, v in headers.items())
-        except (json.JSONDecodeError, TypeError):
-            return str(headers_data)
-    
-    def _generate_csv(self, results: list, asset_type: str):
-        """Generator that produces the CSV content"""
-        # Define the columns
+    def _get_headers_and_formatters(self, asset_type: str):
+        """Return the CSV headers and formatters"""
+        from apps.common.utils import format_datetime, format_list_field
+        
        if asset_type == 'website':
-            columns = ['url', 'host', 'title', 'status_code', 'content_type', 'content_length', 
+            headers = ['url', 'host', 'title', 'status_code', 'content_type', 'content_length', 
                      'webserver', 'location', 'tech', 'vhost', 'created_at']
-            headers = ['URL', 'Host', 'Title', 'Status', 'Content-Type', 'Content-Length',
-                      'Webserver', 'Location', 'Technologies', 'VHost', 'Created At']
        else:
-            columns = ['url', 'host', 'title', 'status_code', 'content_type', 'content_length',
+            headers = ['url', 'host', 'title', 'status_code', 'content_type', 'content_length',
                      'webserver', 'location', 'tech', 'matched_gf_patterns', 'vhost', 'created_at']
-            headers = ['URL', 'Host', 'Title', 'Status', 'Content-Type', 'Content-Length',
-                      'Webserver', 'Location', 'Technologies', 'GF Patterns', 'VHost', 'Created At']
        
-        # Write the BOM and the header row
-        output = StringIO()
-        writer = csv.writer(output)
+        formatters = {
+            'created_at': format_datetime,
+            'tech': lambda x: format_list_field(x, separator='; '),
+            'matched_gf_patterns': lambda x: format_list_field(x, separator='; '),
+            'vhost': lambda x: 'true' if x else ('false' if x is False else ''),
+        }
        
-        # UTF-8 BOM
-        yield '\ufeff'
-        
-        # Header row
-        writer.writerow(headers)
-        yield output.getvalue()
-        output.seek(0)
-        output.truncate(0)
-        
-        # Data rows
-        for result in results:
-            row = []
-            for col in columns:
-                value = result.get(col)
-                if col == 'tech' or col == 'matched_gf_patterns':
-                    # Convert the array to a string
-                    row.append('; '.join(value) if value else '')
-                elif col == 'created_at':
-                    # Format the date
-                    row.append(value.strftime('%Y-%m-%d %H:%M:%S') if value else '')
-                elif col == 'vhost':
-                    row.append('true' if value else 'false' if value is False else '')
-                else:
-                    row.append(str(value) if value is not None else '')
-            
-            writer.writerow(row)
-            yield output.getvalue()
-            output.seek(0)
-            output.truncate(0)
+        return headers, formatters
    
+    @transaction.non_atomic_requests
    def get(self, request: Request):
-        """Export search results as CSV"""
+        """Export search results as CSV (streamed, with no row limit)"""
+        from apps.common.utils import generate_csv_rows
+        
        # Get the search query
        query = request.query_params.get('q', '').strip()
        
@@ -376,23 +339,28 @@ class AssetSearchExportView(APIView):
                status_code=status.HTTP_400_BAD_REQUEST
            )
        
-        # Fetch the search results (with a row limit)
-        results = self.service.search(query, asset_type, limit=self.MAX_EXPORT_ROWS)
-        
-        if not results:
+        # Check whether there are any results (quick check to avoid an empty export)
+        total = self.service.count(query, asset_type)
+        if total == 0:
            return error_response(
                code=ErrorCodes.NOT_FOUND,
                message='No results to export',
                status_code=status.HTTP_404_NOT_FOUND
            )
        
+        # Get the headers and formatters
+        headers, formatters = self._get_headers_and_formatters(asset_type)
+        
+        # Get the streaming data iterator
+        data_iterator = self.service.search_iter(query, asset_type)
+        
        # Build the file name
        timestamp = datetime.now().strftime('%Y%m%d_%H%M%S')
        filename = f'search_{asset_type}_{timestamp}.csv'
        
        # Return the streaming response
        response = StreamingHttpResponse(
-            self._generate_csv(results, asset_type),
+            generate_csv_rows(data_iterator, headers, formatters),
            content_type='text/csv; charset=utf-8'
        )
        response['Content-Disposition'] = f'attachment; filename="{filename}"'
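
The view now delegates CSV generation to `generate_csv_rows` from `apps.common.utils`, which this diff does not show. Below is a minimal sketch of what such a helper could look like. It assumes the helper takes an iterable of dicts, a list of column names, and a per-column formatter map, and that it keeps the UTF-8 BOM emitted by the removed `_generate_csv`. The project's actual implementation may differ.

```python
import csv
from io import StringIO
from typing import Any, Callable, Dict, Iterable, Iterator, List, Optional


def generate_csv_rows(
    data_iterator: Iterable[Dict[str, Any]],
    headers: List[str],
    formatters: Optional[Dict[str, Callable[[Any], str]]] = None,
) -> Iterator[str]:
    """Yield CSV text chunk by chunk: BOM, header row, then one data row per item."""
    formatters = formatters or {}
    buffer = StringIO()
    writer = csv.writer(buffer)

    def flush() -> str:
        # Return what the writer produced and reset the buffer for reuse.
        chunk = buffer.getvalue()
        buffer.seek(0)
        buffer.truncate(0)
        return chunk

    yield "\ufeff"  # UTF-8 BOM so spreadsheet tools detect the encoding
    writer.writerow(headers)
    yield flush()

    for item in data_iterator:
        row = []
        for column in headers:
            value = item.get(column)
            if column in formatters:
                # Column-specific formatting (dates, lists, booleans, ...).
                row.append(formatters[column](value))
            else:
                row.append("" if value is None else str(value))
        writer.writerow(row)
        yield flush()
```

Because the function is a generator, `StreamingHttpResponse` can pull one row at a time, so memory use stays flat regardless of how many rows the iterator yields.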
--- a/backend/config/settings.py
+++ b/backend/config/settings.py
@@ -219,6 +219,8 @@ REST_FRAMEWORK = {
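 # Allow all origins (the frontend and backend are deployed separately; security is enforced by the authentication system)
 CORS_ALLOW_ALL_ORIGINS = os.getenv('CORS_ALLOW_ALL_ORIGINS', 'True').lower() == 'true'
 CORS_ALLOW_CREDENTIALS = True
+# Expose extra response headers to the frontend (Content-Disposition lets file downloads read the file name)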
+CORS_EXPOSE_HEADERS = ['Content-Disposition']

 # ==================== CSRF configuration ====================
 CSRF_TRUSTED_ORIGINS = os.getenv('CSRF_TRUSTED_ORIGINS', 'http://localhost:3000,http://127.0.0.1:3000').split(',')
--- a/docker/docker-compose.dev.yml
+++ b/docker/docker-compose.dev.yml
@@ -8,7 +8,7 @@ services:
    build:
      context: ./postgres
      dockerfile: Dockerfile
-    image: ${DOCKER_USER:-yyhuni}/xingrin-postgres:15
+    image: ${DOCKER_USER:-yyhuni}/xingrin-postgres:${IMAGE_TAG:-dev}
    restart: always
    environment:
      POSTGRES_DB: ${DB_NAME}
--- a/docker/docker-compose.yml
+++ b/docker/docker-compose.yml
@@ -14,7 +14,7 @@ services:
    build:
      context: ./postgres
      dockerfile: Dockerfile
-    image: ${DOCKER_USER:-yyhuni}/xingrin-postgres:15
+    image: ${DOCKER_USER:-yyhuni}/xingrin-postgres:${IMAGE_TAG:?IMAGE_TAG is required}
    restart: always
    environment:
      POSTGRES_DB: ${DB_NAME}
Author	SHA1	Message	Date
yyhuni	674acdac66	refactor(asset): move database extension initialization to migrations - Remove pg_trgm and pg_ivm extension setup from AssetConfig.ready() method - Move extension creation to migration 0002 using RunSQL operations - Add pg_trgm extension creation for text search index support - Add pg_ivm extension creation for IMMV incremental maintenance - Generate unique cursor names in search_service to prevent concurrent request conflicts - Add @transaction.non_atomic_requests decorator to export view for server-side cursor compatibility - Simplify app initialization by delegating extension setup to database migrations - Improve thread safety and concurrency handling for streaming exports	2026-01-03 18:20:27 +08:00
github-actions[bot]	c59152bedf	chore: bump version to v1.3.7-dev	2026-01-03 09:56:39 +00:00
yyhuni	b4037202dc	feat: use registry cache for faster builds	2026-01-03 17:35:54 +08:00
yyhuni	4b4f9862bf	ci(docker): add postgres image build configuration and update image tags - Add xingrin-postgres image build job to docker-build workflow for multi-platform support (linux/amd64,linux/arm64) - Update docker-compose.dev.yml to use IMAGE_TAG variable with dev as default fallback - Update docker-compose.yml to use IMAGE_TAG variable with required validation - Replace hardcoded postgres image tag (15) with dynamic IMAGE_TAG for better version management - Enable flexible image tagging across development and production environments	2026-01-03 17:26:34 +08:00
github-actions[bot]	1c42e4978f	chore: bump version to v1.3.5-dev	2026-01-03 08:44:06 +00:00
github-actions[bot]	57bab63997	chore: bump version to v1.3.3-dev	2026-01-03 05:55:07 +00:00
github-actions[bot]	b1f0f18ac0	chore: bump version to v1.3.4-dev	2026-01-03 05:54:50 +00:00
yyhuni	ccee5471b8	docs(readme): add notification push service documentation - Add notification push service feature to visualization interface section - Document support for real-time WeChat Work, Telegram, and Discord message push - Enhance feature list clarity for notification capabilities	2026-01-03 13:34:36 +08:00
yyhuni	0ccd362535	Optimize download logic	2026-01-03 13:32:58 +08:00