This hands-on tutorial guides developers through building a Chrome extension that promotes intentional, focused use of AI tools, covering design rationale, tech stack selection, and step-by-step implementation.
AutoGPT is an open-source autonomous AI agent framework that aims to lower the barrier to using and developing AI, letting users build custom agents on top of it.
Hugging Face Transformers is a widely adopted open-source library for defining, training, and running inference on text, vision, audio, and multimodal models. It serves as core infrastructure in AI development workflows.
NousResearch/hermes-agent is an open-source agent framework designed to evolve with user needs, supporting scalable task automation and multi-step reasoning. It provides source code, configuration examples, and basic documentation, but does not yet include comprehensive tutorials or production-grade deployment guidance.
f/prompts.chat is an open-source, community-driven platform for sharing and discovering prompts (formerly Awesome ChatGPT Prompts). It can be self-hosted to keep data private.
LangChain is an open-source framework for building applications powered by large language models, focused on agent engineering, chaining, and tool integration. It gives developers standardized abstractions and reusable components.
Dify is a production-oriented open-source platform for building and deploying agentic workflows, with visual orchestration, model integration, and application publishing.
This post documents the failures the author ran into, and the lessons drawn from them, when running a critical AI workflow on a backup system after Fable 5 abruptly went offline under a government directive. It focuses on AI service stability and operational resilience.
Open WebUI is an open-source, user-friendly local AI interface that supports multiple model backends, including Ollama and the OpenAI API, for quick deployment and interactive use.
This Hacker News discussion thread asks whether developers have fully replaced cloud-based LLMs such as Claude and GPT with locally run models for daily coding work, and solicits real-world setup details and performance feedback (e.g., tokens/sec).
This final installment of a series argues that AI should not be trusted blindly but deliberately designed. It emphasizes building trustworthy AI systems through four engineering mechanisms (knowledge graph, Auto Review, Self-Healing, and Recurrence Prevention) plus a PR layer aimed at non-engineers, rather than relying on generative capability alone.
llama.cpp is a lightweight, high-performance LLM inference framework implemented in C/C++ that runs quantized models efficiently on CPUs and edge devices.
Nemotron 3 Ultra is an open-source Mixture-of-Experts (MoE) language model that combines Mamba and Transformer architectures. It has 550B total parameters and 55B active parameters, supports a 1M-token context, and uses techniques such as LatentMoE and Multi-Token Prediction to improve agentic reasoning.
This article analyzes why line items on Google Gemini API bills often do not match the advertised model names, tracing the mismatch to underlying model aliasing, version routing, and tiered pricing.
Langflow is an open-source, low-code visual platform for building, debugging, and deploying LLM-powered AI agents and workflows.
Firecrawl is an open-source web data acquisition tool with a programmable API for large-scale web search, crawling, and interaction, optimized for AI applications such as RAG.
This hands-on tutorial shows how to fine-tune Google's Gemma model (the tutorial calls it 'Gemma 4', a version name that may be inaccurate) to translate classical Korean texts, such as Joseon-era documents.
This piece examines whether Europe has the compute infrastructure needed to train frontier AI models on its own, focusing on geopolitics, industrial capacity, and technological sovereignty.
This paper introduces Tangram, a non-uniform KV cache compression method that improves LLM serving efficiency in multi-turn conversations, easing the memory bottleneck while maintaining high accuracy. The method challenges the implicit assumption in existing inference frameworks that KV lengths are uniform across attention heads.
This paper introduces the Data Journalist Agent, an end-to-end AI agent that integrates data analysis, narrative construction, and multimodal (text and visual) storytelling to turn raw data into verifiable news stories automatically.