Dify Workflow in Practice (7): Building a Business Query Workflow with Dify + DeepSeek + Doris

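Many teams in IT R&D are exploring how to combine the DeepSeek large language model with a database, so that asking the AI a question is enough to have the data in the database turned into text or charts automatically. This matches a common need in day-to-day development. This article demonstrates such a workflow built with a Dify workflow, DeepSeek, and Doris.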

1. Deploy Doris

2. Prepare test data in Doris

The mock data here is generated directly with the TPC-H tools that ship with Doris.
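Once the data is loaded, a quick row count per table is one way to confirm that nothing is empty. The sketch below only builds the SQL statements; the table names are the eight TPC-H tables listed in the workflow's system prompt, and the database name `tpch` is an assumption that should be replaced with whatever database the data was loaded into.

```python
from typing import List

# The eight TPC-H tables, as listed in the workflow's system prompt.
TPCH_TABLES = [
    "lineitem", "orders", "partsupp", "part",
    "customer", "supplier", "nation", "region",
]


def row_count_queries(database: str) -> List[str]:
    """Build one COUNT(*) statement per TPC-H table in the given database."""
    return [
        f"SELECT COUNT(*) AS row_count FROM `{database}`.`{table}`;"
        for table in TPCH_TABLES
    ]


if __name__ == "__main__":
    # "tpch" is a hypothetical database name used only for illustration.
    for statement in row_count_queries("tpch"):
        print(statement)
```

The printed statements can be pasted into any MySQL-compatible client connected to the Doris FE.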

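3. Install the database plugin

Go into Dify and search for the database plugin.

After clicking Install, authorize the plugin. The authorization URL has the following format:

mysql+pymysql://{user}:{passwd}@{fe_ip}:{doris_jdbc_port}/{database_name}

(Figure: the authorization dialog with the URL filled in.)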
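The URL looks like a SQLAlchemy-style connection string. Assuming that is how the plugin parses it, special characters in the user name or password (such as `@` or `/`) need to be percent-encoded, otherwise they are read as URL delimiters. The helper below is a minimal sketch: the function name, the sample credentials, and the sample IP address are made up for illustration. The default of 9030 is the Doris FE `query_port` default; use your own value if it was changed.

```python
from urllib.parse import quote


def doris_db_uri(user: str, passwd: str, fe_ip: str, database: str,
                 query_port: int = 9030) -> str:
    """Fill in the mysql+pymysql URL template used to authorize the plugin."""
    # safe="" forces every reserved character (including "/" and "@") to be encoded.
    user_enc = quote(user, safe="")
    passwd_enc = quote(passwd, safe="")
    return f"mysql+pymysql://{user_enc}:{passwd_enc}@{fe_ip}:{query_port}/{database}"


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    print(doris_db_uri("root", "p@ss/word", "192.0.2.10", "tpch"))
```

With these sample values the password is emitted as `p%40ss%2Fword`, so the `@` that separates the credentials from the host stays unambiguous.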

4. Import the DSL

Next, create a chatflow workflow and import the following DSL into it:

app:
  description: Doris ChatBI Demo
  icon: 🤖
  icon_background: '#FFEAD5'
  mode: advanced-chat
  name: Doris ChatBI
  use_icon_as_answer_icon: false
dependencies:
- current_identifier: null
  type: marketplace
  value:
    marketplace_plugin_unique_identifier: langgenius/siliconflow:0.0.8@217f973bd7ced1b099c2f0c669f1356bdf4cc38b8372fd58d7874f9940b95de3
- current_identifier: null
  type: package
  value:
    plugin_unique_identifier: hjlarry/database:0.0.4@3a0b78c887a9321a78fca56f4c68ca85434a298032d34964d92b61e322977938
kind: app
version: 0.1.5
workflow:
  conversation_variables: []
  environment_variables: []
  features:
    file_upload:
      allowed_file_extensions: []
      allowed_file_types:
      - image
      - document
      allowed_file_upload_methods:
      - remote_url
      - local_file
      enabled: true
      fileUploadConfig:
        audio_file_size_limit: 50
        batch_count_limit: 5
        file_size_limit: 15
        image_file_size_limit: 10
        video_file_size_limit: 100
        workflow_file_upload_limit: 10
      image:
        enabled: false
        number_limits: 3
        transfer_methods:
        - local_file
        - remote_url
      number_limits: 1
    opening_statement: "您好 \U0001F60A 我是基于TPC-H构建的智能分析助手Doris ChatBI \U0001F4C8  \n\
      \n支持对核心业务实体（客户/订单/供应商）的多维分析 \U0001F50D 例如\U0001F447\n\n1️⃣ 精准定位：\"客户订单分布数量查询\"\
      \n2️⃣ 关联分析：\"统计客户的订单金额与供应商的相关性\"\n3️⃣ 趋势分析：\"按季度分析客户的订单金额变化规律\"\n\n⚠️描述越具体，系统生成的SQL和图表越精准哦❕"
    retriever_resource:
      enabled: true
    sensitive_word_avoidance:
      enabled: false
    speech_to_text:
      enabled: false
    suggested_questions:
    - 客户订单分布数量查询
    - 统计客户的订单金额与供应商的相关性
    - 按季度分析客户的订单金额变化规律
    suggested_questions_after_answer:
      enabled: true
    text_to_speech:
      enabled: false
      language: ''
      voice: ''
  graph:
    edges:
    - data:
        sourceType: start
        targetType: llm
      id: 1743064693668-llm
      source: '1743064693668'
      sourceHandle: source
      target: llm
      targetHandle: target
      type: custom
    - data:
        isInLoop: false
        sourceType: llm
        targetType: answer
      id: 1743064832991-source-1743065600747-target
      source: '1743064832991'
      sourceHandle: source
      target: '1743065600747'
      targetHandle: target
      type: custom
      zIndex: 0
    - data:
        isInIteration: false
        isInLoop: false
        sourceType: tool
        targetType: template-transform
      id: 1743064814387-source-1743081304662-target
      source: '1743064814387'
      sourceHandle: source
      target: '1743081304662'
      targetHandle: target
      type: custom
      zIndex: 0
    - data:
        isInLoop: false
        sourceType: template-transform
        targetType: llm
      id: 1743081304662-source-1743064832991-target
      source: '1743081304662'
      sourceHandle: source
      target: '1743064832991'
      targetHandle: target
      type: custom
      zIndex: 0
    - data:
        isInIteration: false
        isInLoop: false
        sourceType: llm
        targetType: code
      id: llm-source-1743083966673-target
      source: llm
      sourceHandle: source
      target: '1743083966673'
      targetHandle: target
      type: custom
      zIndex: 0
    - data:
        isInLoop: false
        sourceType: code
        targetType: tool
      id: 1743083966673-source-1743064814387-target
      source: '1743083966673'
      sourceHandle: source
      target: '1743064814387'
      targetHandle: target
      type: custom
      zIndex: 0
    nodes:
    - data:
        desc: ''
        selected: false
        title: Input
        type: start
        variables: []
      height: 54
      id: '1743064693668'
      position:
        x: 30
        y: 252.5
      positionAbsolute:
        x: 30
        y: 252.5
      selected: false
      sourcePosition: right
      targetPosition: left
      type: custom
      width: 244
    - data:
        context:
          enabled: false
          variable_selector: []
        desc: ''
        memory:
          query_prompt_template: '{{#sys.query#}}'
          role_prefix:
            assistant: ''
            user: ''
          window:
            enabled: false
            size: 10
        model:
          completion_params: {}
          mode: chat
          name: deepseek-ai/DeepSeek-R1
          provider: langgenius/siliconflow/siliconflow
        prompt_template:
        - role: system
          text: ''
        selected: false
        title: LLM
        type: llm
        variables: []
        vision:
          enabled: false
      height: 90
      id: llm
      position:
        x: 334
        y: 252.5
      positionAbsolute:
        x: 334
        y: 252.5
      selected: false
      sourcePosition: right
      targetPosition: left
      type: custom
      width: 244
    - data:
        desc: ''
        is_team_authorization: true
        output_schema: null
        paramSchemas:
        - auto_generate: null
          default: null
          form: llm
          human_description:
            en_US: The SQL query string.
            ja_JP: The SQL query string.
            pt_BR: The SQL query string.
            zh_Hans: SQL 查询语句。
          label:
            en_US: SQL Query
            ja_JP: SQL Query
            pt_BR: SQL Query
            zh_Hans: SQL 查询语句
          llm_description: The SQL query string.
          max: null
          min: null
          name: query
          options: []
          placeholder: null
          precision: null
          required: true
          scope: null
          template: null
          type: string
        - auto_generate: null
          default: json
          form: form
          human_description:
            en_US: Choose the output format.
            ja_JP: Choose the output format.
            pt_BR: Choose the output format.
            zh_Hans: 选择输出格式。
          label:
            en_US: Output format
            ja_JP: Output format
            pt_BR: Output format
            zh_Hans: 输出格式
          llm_description: ''
          max: null
          min: null
          name: format
          options:
          - label:
              en_US: JSON
              ja_JP: JSON
              pt_BR: JSON
              zh_Hans: JSON
            value: json
          - label:
              en_US: CSV
              ja_JP: CSV
              pt_BR: CSV
              zh_Hans: CSV
            value: csv
          - label:
              en_US: YAML
              ja_JP: YAML
              pt_BR: YAML
              zh_Hans: YAML
            value: yaml
          - label:
              en_US: Markdown
              ja_JP: Markdown
              pt_BR: Markdown
              zh_Hans: Markdown
            value: md
          - label:
              en_US: Excel
              ja_JP: Excel
              pt_BR: Excel
              zh_Hans: Excel
            value: xlsx
          - label:
              en_US: HTML
              ja_JP: HTML
              pt_BR: HTML
              zh_Hans: HTML
            value: html
          placeholder: null
          precision: null
          required: false
          scope: null
          template: null
          type: select
        - auto_generate: null
          default: null
          form: llm
          human_description:
            en_US: Optional, Filling in this field will overwrite the database connection
              entered during authorization.
            ja_JP: Optional, Filling in this field will overwrite the database connection
              entered during authorization.
            pt_BR: Optional, Filling in this field will overwrite the database connection
              entered during authorization.
            zh_Hans: 选填，填写后将覆盖授权时填写的数据库连接。
          label:
            en_US: DB URI
            ja_JP: DB URI
            pt_BR: DB URI
            zh_Hans: DB URI
          llm_description: ''
          max: null
          min: null
          name: db_uri
          options: []
          placeholder: null
          precision: null
          required: false
          scope: null
          template: null
          type: string
        params:
          db_uri: ''
          format: ''
          query: ''
        provider_id: hjlarry/database/database
        provider_name: hjlarry/database/database
        provider_type: builtin
        selected: false
        title: Doris Execute
        tool_configurations:
          format: json
        tool_label: SQL Execute
        tool_name: sql_execute
        tool_parameters:
          query:
            type: mixed
            value: '{{#1743083966673.text2sql#}}'
        type: tool
      height: 90
      id: '1743064814387'
      position:
        x: 942
        y: 252.5
      positionAbsolute:
        x: 942
        y: 252.5
      selected: true
      sourcePosition: right
      targetPosition: left
      type: custom
      width: 244
    - data:
        context:
          enabled: true
          variable_selector:
          - '1743081304662'
          - output
        desc: ''
        model:
          completion_params: {}
          mode: chat
          name: deepseek-ai/DeepSeek-R1
          provider: langgenius/siliconflow/siliconflow
        prompt_template:
        - id: 67225e28-309c-4ad1-bec9-d0cd98f7bb8e
          role: system
          text: "# Doris ChatBI数据分析专家工作指南\n\n## 角色定位\n专业的SQL数据分析专家，负责解析Doris数据库的查询结果\n\
            \n## 核心规则\n1. 直接分析已提供数据，默认数据已满足查询条件。\n2. 接受数据原貌，不质疑数据有效性。\n3. 无需二次筛选或验证数据范围。\n\
            4. json结果中没内容则为空数据集，统一回复\"没有查询到相关数据\"。\n5. 避免使用提示性语言。\n6. 分析结果以Markdown格式输出。\n\
            7. 整理SQL查询结果：\n   - 以Markdown表格格式输出，放置在输出开头。\n   - 以ECharts图表配置项格式输出，放置在最后。图表配置应尽量简洁，避免过多冗余配置项。\n\
            \n**输出格式如下：**\n\n```echarts\n{\n  \"title\": {\n    \"text\": \"示例图表\"\
            \n  },\n  \"tooltip\": {\n    \"trigger\": \"item\",\n    \"formatter\"\
            : \"{a} <br/>{b}: {c} ({d}%)\"\n  },\n  \"legend\": {\n    \"orient\"\
            : \"vertical\",\n    \"left\": \"left\",\n    \"data\": [\"A\", \"B\"\
            , \"C\", \"D\"]\n  },\n  \"series\": [\n    {\n      \"name\": \"示例数据\"\
            ,\n      \"type\": \"pie\",\n      \"radius\": \"50%\",\n      \"data\"\
            : [\n        { \"value\": 335, \"name\": \"A\" },\n        { \"value\"\
            : 310, \"name\": \"B\" },\n        { \"value\": 234, \"name\": \"C\" },\n\
            \        { \"value\": 135, \"name\": \"D\" }\n      ],\n      \"emphasis\"\
            : {\n        \"itemStyle\": {\n          \"shadowBlur\": 10,\n       \
            \   \"shadowOffsetX\": 0,\n          \"shadowColor\": \"rgba(0, 0, 0,\
            \ 0.5)\"\n        }\n      }\n    }\n  ]\n}\n```\n\n### 数据处理原则\n1.严格基于JSON数据集{{#context#}}。\n\
            2.数据已预筛选，直接进行统计分析。\n3.不进行数据条件的二次确认。\n\n### 报告结构要求\n1.数据概览\n2.详细分析\n3.结论部分\n\
            \n### 背景说明\n这是一个经典的TPC-H 决策支持基准（Decision Support Benchmark），包含以下核心表：\n\
            - lineitem：订单明细表\n- orders：订单表\n- partsupp：零部件供应表\n- part：\t零部件表\n- customer：客户表\n\
            - supplier：供应商表\n- nation：国家表\n- region：区域表\n\n### 数据处理流程\n1.接收JSON格式查询结果\n\
            2.验证数据完整性\n3.进行统计分析\n4.生成分析报告\n\n### 报告输出要求\n1.使用准确的数据描述\n2.提供详细的统计分析\n\
            3.标注重要发现\n4.保持客观性\n\n### 特殊情况处理\n- 空数据集：直接返回\"没有查询到相关数据\"\n- 异常值：如实报告，不作主观判断\n\
            - 数据缺失：说明缺失情况，不补充假设数据\n\n### 常见分析维度\n1.订单分析\n- 订单数量\n- 订单分布\n- 订单趋势\n\n\
            2.客户分布\n- 下单数量\n- 地区分布\n- 消费分布\n\n## 输出格式\n如果上游数据库查询没有结果，则直接结合echarts返回\
            \ 一个空白图，图中告知：没有查询到相关数据；\n如果有数据则结合echarts，将数据用适合的图形进行可视化展示"
        selected: false
        title: Doris ChatBI
        type: llm
        variables: []
        vision:
          enabled: false
      height: 90
      id: '1743064832991'
      position:
        x: 1550
        y: 252.5
      positionAbsolute:
        x: 1550
        y: 252.5
      selected: false
      sourcePosition: right
      targetPosition: left
      type: custom
      width: 244
    - data:
        answer: '{{#1743064832991.text#}}'
        desc: ''
        selected: false
        title: Result
        type: answer
        variables: []
      height: 105
      id: '1743065600747'
      position:
        x: 1854
        y: 252.5
      positionAbsolute:
        x: 1854
        y: 252.5
      selected: false
      sourcePosition: right
      targetPosition: left
      type: custom
      width: 244
    - data:
        desc: ''
        selected: false
        template: '{{ json_result }}'
        title: Json Result
        type: template-transform
        variables:
        - value_selector:
          - '1743064814387'
          - json
          variable: json_result
      height: 54
      id: '1743081304662'
      position:
        x: 1246
        y: 252.5
      positionAbsolute:
        x: 1246
        y: 252.5
      selected: false
      sourcePosition: right
      targetPosition: left
      type: custom
      width: 244
    - data:
        code: "import re  \n\ndef main(text2sql: str) -> dict:\n    text2sql = text2sql.replace('```sql\\\
          n', ' ').replace('\\n```', ' ').replace('\\n', ' ').strip()\n    text2sql\
          \ = re.sub(r'(LIMIT \\d+;).*', r'\\1 ', text2sql, flags=re.IGNORECASE)\n\
          \    return {\n        \"text2sql\": text2sql,\n    }"
        code_language: python3
        desc: ''
        outputs:
          text2sql:
            children: null
            type: string
        selected: false
        title: sql formatting
        type: code
        variables:
        - value_selector:
          - llm
          - text
          variable: text2sql
      height: 54
      id: '1743083966673'
      position:
        x: 638
        y: 252.5
      positionAbsolute:
        x: 638
        y: 252.5
      selected: false
      sourcePosition: right
      targetPosition: left
      type: custom
      width: 244
    viewport:
      x: -1013.6999999999999
      y: 134.5
      zoom: 0.7

Remember to configure the DeepSeek model here. If DeepSeek is not available to you, change the model to another large language model.
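The `sql formatting` code node in the DSL is what turns the model's reply into a single executable statement. Below is a standalone copy of its logic so it can be tried outside Dify. The only change is that the Markdown fence is built from a `FENCE` constant instead of being written literally, and the sample reply is made up.

```python
import re

# A Markdown code fence (three backticks), built this way to keep the example readable.
FENCE = "`" * 3


def main(text2sql: str) -> dict:
    # Drop the opening and closing Markdown fences and flatten the reply to one line.
    text2sql = (
        text2sql.replace(FENCE + "sql\n", " ")
        .replace("\n" + FENCE, " ")
        .replace("\n", " ")
        .strip()
    )
    # Keep everything up to the first "LIMIT <n>;" and discard whatever follows it.
    text2sql = re.sub(r"(LIMIT \d+;).*", r"\1 ", text2sql, flags=re.IGNORECASE)
    return {"text2sql": text2sql}


if __name__ == "__main__":
    # A made-up model reply: fenced SQL followed by an explanation.
    reply = (
        FENCE + "sql\nSELECT c_name\nFROM customer\nLIMIT 10;\n" + FENCE
        + "\nThis query lists ten customers."
    )
    print(repr(main(reply)["text2sql"]))
```

For the sample reply, the explanation after the fence is dropped and the statement ends at `LIMIT 10;`. Note that the trailing text is only removed when the SQL ends with `LIMIT <n>;`. A reply without that clause keeps any prose the model appended, so it helps to ask the Text2SQL model to always end its statement that way.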

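5. Run the workflow

Finally, test the workflow with a question such as "Customer order distribution count query" (客户订单分布数量查询, one of the suggested questions in the DSL).

After it runs, the chart is generated.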
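The chart appears because the `Doris ChatBI` prompt asks the model to put an ECharts option object inside a code fence tagged `echarts`, which the chat UI is expected to render as a chart. To see what such a block looks like for a given result set, here is a minimal sketch that builds one by hand. The function name, the column names, and the sample rows are all invented for illustration; in the workflow itself the model writes this block.

```python
import json

# A Markdown code fence (three backticks).
FENCE = "`" * 3


def pie_chart_block(title: str, rows: list, name_key: str, value_key: str) -> str:
    """Wrap a minimal ECharts pie option, built from query rows, in an echarts fence."""
    option = {
        "title": {"text": title},
        "tooltip": {"trigger": "item"},
        "series": [
            {
                "name": title,
                "type": "pie",
                "radius": "50%",
                # One slice per row: the slice label comes from name_key,
                # the slice size from value_key.
                "data": [
                    {"value": row[value_key], "name": str(row[name_key])}
                    for row in rows
                ],
            }
        ],
    }
    body = json.dumps(option, ensure_ascii=False, indent=2)
    return f"{FENCE}echarts\n{body}\n{FENCE}"


if __name__ == "__main__":
    # Made-up rows standing in for the JSON returned by the SQL Execute tool.
    rows = [
        {"order_count": 3, "customers": 120},
        {"order_count": 5, "customers": 80},
    ]
    print(pie_chart_block("Customers by order count", rows, "order_count", "customers"))
```

Comparing the model's output with a hand-built block like this is a simple way to check whether a chart that fails to render is caused by malformed JSON or by the data itself.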

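Notes:

1. This example shows how to ask questions through a Dify workflow backed by DeepSeek and Doris. It combines AI with a database: based on the question asked, data from the database is turned into a chart for display.

2. Doris can be replaced with another database in this example and the workflow still works.

3. One core part of the workflow is the second step, Text2SQL (the `LLM` node), which converts natural language into SQL. Its system prompt is empty in the DSL above, so you need to write the natural-language-to-SQL prompt for your own scenario.

4. The second core part is the sixth step (the `Doris ChatBI` node), which converts the SQL result into the rules and format that ECharts understands, so that the chart can be rendered.

5. The overall flow is: input -> natural language to SQL -> execute SQL -> convert the SQL result into visual BI rules -> ECharts output.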
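The step numbers in the notes above can be checked against the DSL by following its `edges`. The sketch below copies the `source`/`target` pairs and node titles from the DSL and walks the chain from the start node. It assumes the graph is a single linear chain, which holds for this workflow but not for Dify workflows in general.

```python
# (source, target) pairs copied from the `edges` list of the DSL.
EDGES = [
    ("1743064693668", "llm"),
    ("1743064832991", "1743065600747"),
    ("1743064814387", "1743081304662"),
    ("1743081304662", "1743064832991"),
    ("llm", "1743083966673"),
    ("1743083966673", "1743064814387"),
]

# Node id -> title, copied from the `nodes` list of the DSL.
TITLES = {
    "1743064693668": "Input",
    "llm": "LLM",
    "1743083966673": "sql formatting",
    "1743064814387": "Doris Execute",
    "1743081304662": "Json Result",
    "1743064832991": "Doris ChatBI",
    "1743065600747": "Result",
}


def execution_order(edges):
    """Follow a linear chain of edges from its start node to its end node."""
    next_node = dict(edges)
    targets = set(next_node.values())
    # The start node is the only source that never appears as a target.
    current = next(src for src in next_node if src not in targets)
    order = [current]
    while current in next_node:
        current = next_node[current]
        order.append(current)
    return order


if __name__ == "__main__":
    print(" -> ".join(TITLES[node] for node in execution_order(EDGES)))
```

The resulting order is Input, LLM, sql formatting, Doris Execute, Json Result, Doris ChatBI, Result, so the second step is the Text2SQL `LLM` node and the sixth is `Doris ChatBI`.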

Adapt the workflow above to your own needs. It works with Doris as well as with MySQL, PostgreSQL, and other databases.
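Since the Text2SQL node ships with an empty system prompt, adapting the workflow mostly means writing that prompt for your own schema. As a starting point, here is a small sketch that assembles one from a table list. The table descriptions are translated from the `Doris ChatBI` prompt in the DSL; the prompt wording, the function name, and the `dialect` parameter are assumptions and not part of the original workflow.

```python
# Table name -> short description, translated from the Doris ChatBI prompt in the DSL.
TPCH_TABLES = {
    "lineitem": "order line items",
    "orders": "orders",
    "partsupp": "part supply",
    "part": "parts",
    "customer": "customers",
    "supplier": "suppliers",
    "nation": "nations",
    "region": "regions",
}


def build_text2sql_prompt(dialect: str, tables: dict) -> str:
    """Assemble a system prompt for the Text2SQL LLM node (wording is illustrative)."""
    lines = [
        f"You are a Text2SQL assistant for a {dialect} database.",
        "Use only the tables listed below:",
    ]
    lines += [f"- {name}: {description}" for name, description in tables.items()]
    lines += [
        "Return exactly one SELECT statement inside a sql code fence.",
        # The sql formatting node trims the reply after the first 'LIMIT <n>;'.
        "Always end the statement with LIMIT <n>; and add nothing inside the fence after it.",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_text2sql_prompt("Doris", TPCH_TABLES))
```

Swapping `"Doris"` for `"MySQL"` or `"PostgreSQL"` and replacing the table dictionary is the kind of change the note above has in mind. In practice the prompt will likely also need column names and a few sample questions for the model to produce reliable SQL.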
