This is Microsoft's hands-on generative AI course for beginners, hosted on GitHub. Its 21 structured lessons cover fundamental concepts, API calls, prompt engineering, and building small applications.
AutoGPT is an open-source autonomous AI agent framework that aims to lower the barrier to using and building AI applications. It lets users build LLM-based agents that plan, execute, and iterate on tasks independently.
NousResearch/hermes-agent is an open-source, extensible agent framework designed to evolve with user needs, supporting dynamic tool calling and context-aware adaptation. The repository provides code, configuration examples, and basic documentation so developers can integrate and customize it quickly.
This article examines the tension between "query distortion" and "tool-call truthfulness" in AI agent systems. It argues that relevance evaluation alone is not enough to guarantee reliable reasoning, and it continues recent research on self-correcting systems.
This article examines the root causes that make LLM agent behavior hard to reproduce in production and notes that some of the nondeterminism is a deliberate design feature. It introduces record-and-replay as a practical debugging method for capturing, diagnosing, and fixing real-world agent failures.
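The record-and-replay idea named in the item above can be illustrated with a minimal sketch. It is not the article's implementation: the `Recorder` class, its `call` method, and the `flaky_model` stand-in are hypothetical names chosen for illustration, and the sketch assumes every call takes JSON-serializable arguments and returns a JSON-serializable result.

```python
import hashlib
import json
import random
from typing import Any, Callable, Dict


class Recorder:
    """Hypothetical helper: records nondeterministic call results, then replays them."""

    def __init__(self, mode: str = "record") -> None:
        if mode not in ("record", "replay"):
            raise ValueError("mode must be 'record' or 'replay'")
        self.mode = mode
        self.tape: Dict[str, Any] = {}

    @staticmethod
    def _key(name: str, payload: Dict[str, Any]) -> str:
        # Stable hash of the call name plus its arguments.
        blob = json.dumps({"name": name, "payload": payload}, sort_keys=True)
        return hashlib.sha256(blob.encode("utf-8")).hexdigest()

    def call(self, name: str, fn: Callable[[Dict[str, Any]], Any],
             payload: Dict[str, Any]) -> Any:
        key = self._key(name, payload)
        if self.mode == "replay":
            # In replay mode the real function is never invoked.
            if key not in self.tape:
                raise KeyError(f"no recorded result for call {name!r}")
            return self.tape[key]
        result = fn(payload)
        # Simplification: a repeated identical call overwrites the earlier entry.
        self.tape[key] = result
        return result

    def dump(self) -> str:
        """Serialize the tape so it can be stored next to a bug report."""
        return json.dumps(self.tape, sort_keys=True)

    @classmethod
    def load(cls, data: str) -> "Recorder":
        replayer = cls(mode="replay")
        replayer.tape = json.loads(data)
        return replayer


def flaky_model(payload: Dict[str, Any]) -> Dict[str, Any]:
    # Stand-in for a nondeterministic LLM or tool call (assumption, not a real API).
    return {"text": f"{payload['prompt']}-{random.random()}"}


# Record a production-like run, then replay it deterministically.
recorder = Recorder("record")
recorded = recorder.call("llm", flaky_model, {"prompt": "hi"})
replayer = Recorder.load(recorder.dump())
replayed = replayer.call("llm", flaky_model, {"prompt": "hi"})
```

Keying the tape by a hash of the arguments keeps the sketch short. An agent that repeats the same call with different results would more plausibly need a per-call sequence number in the key.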
Hugging Face Transformers is a widely used open-source library that provides thousands of pre-trained models and a unified API for inference and training across text, vision, audio, and multimodal tasks.
This post introduces Gemma 4 12B, a unified multimodal large model that needs no separate encoder, and highlights its architectural novelty and cross-modal capabilities. According to the post, the model has not been open-sourced or commercially released and is so far disclosed to the community only as a technical concept, without implementation details.
f/prompts.chat is an open-source, community-driven prompt-sharing platform, formerly Awesome ChatGPT Prompts. It supports self-hosting for data privacy, including private deployments within organizations.
This article gives a detailed technical walkthrough of building the AI-powered meeting platform Hoovik from scratch, covering key modules such as WebRTC signaling, a Redis-based distributed Node.js architecture, and real-time emotion analysis.
Ollama is an open-source framework for running large language models locally. It supports pulling and running models with a single command, including Kimi-K2.6, GLM-5.1, Qwen, Gemma, and other mainstream open-source and commercially aligned models.
The author built a deliberately vulnerable web application and spent $1,500 testing whether mainstream large language models (LLMs) could discover and exploit its vulnerabilities on their own. The results show both what current LLMs can do and where they fall short in real-world penetration testing tasks.
Dify is a production-oriented open-source platform for building and deploying agentic workflows, with visual orchestration, model integration, and application publishing.
This article explains how to run AI coding agents safely inside Docker container sandboxes so that they cannot access or damage the host system without authorization. It covers practical points such as container configuration, privilege restrictions, and network isolation.
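As a rough illustration of the three areas that item lists (configuration, privilege restrictions, network isolation), the sketch below assembles a hardened `docker run` command. It is not the article's configuration: the function name, the image name, the mount path, and the resource limits are assumptions chosen for illustration. The flags themselves are standard Docker CLI options.

```python
from typing import List


def sandboxed_docker_command(image: str, host_dir: str,
                             agent_cmd: List[str]) -> List[str]:
    """Build a `docker run` argument list with restrictive defaults (illustrative)."""
    return [
        "docker", "run", "--rm",
        # Network isolation: the container gets no network interfaces
        # other than loopback.
        "--network", "none",
        # Privilege restrictions: drop all Linux capabilities and block
        # privilege escalation through setuid binaries.
        "--cap-drop", "ALL",
        "--security-opt", "no-new-privileges",
        # Run as an unprivileged user (UID/GID 1000 is an assumption).
        "--user", "1000:1000",
        # Configuration: read-only root filesystem with a writable /tmp.
        "--read-only",
        "--tmpfs", "/tmp",
        # Resource limits (values are placeholders).
        "--pids-limit", "256",
        "--memory", "2g",
        # Only the project directory is shared with the container.
        "--volume", f"{host_dir}:/workspace",
        "--workdir", "/workspace",
        image,
        *agent_cmd,
    ]


# Hypothetical image and command names, for illustration only.
command = sandboxed_docker_command(
    "example/agent-sandbox:latest", "/home/dev/project", ["agent", "--task", "fix-tests"]
)
# The list can be passed to subprocess.run(command, check=True) on a host with Docker.
```

With `--network none` the agent cannot reach a hosted model API, so a setup that depends on one would need a restricted network instead of none at all.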
LangChain is an open-source framework for building applications based on large language models, focused on agent engineering, chained calls, and tool integration.
RepoRecon is an AI tool for developers that automatically analyzes and explains low-quality or hard-to-read code in repositories, reducing the burden of manual review.
This piece discusses Uber's business decision to cap spending on AI tools such as Claude Code at $1,500 per month. It treats the cap as an important signal for enterprise AI pricing trends, reflecting how large companies approach AI cost control in practice.
Uruky is a new EU-based search engine positioned as a Kagi alternative. It has newly launched image search and URL rewriting, and it offers a 2-hour free trial obtained by solving a proof-of-work CAPTCHA. Its source code is released under a source-available license, and the project has dropped its earlier requirement to sign an NDA/NCC, citing privacy reasons.
Open WebUI is an open-source, user-friendly local interface for interacting with large language models. It supports multiple backends such as Ollama and the OpenAI API, which makes it easy to deploy and use private AI services quickly.
This is a critical commentary on the current trend of building coding-agent features into every AI tool. It questions the practical value of that trend and whether it makes sense for user experience and product coherence.
This paper proposes UniKE, which it describes as the first benchmark to systematically evaluate cross-modal knowledge editing in unified multimodal models (UMMs). It investigates whether knowledge edits made through text generalize to image generation tasks.