Node.js Fastify server that ingests news articles from RSS, SEC EDGAR 8-K filings, Alpha Vantage News Sentiment, Finnhub company news, and GDELT into a local SQLite archive.

Setup

Install dependencies:
```
npm install
```
Edit config.json to set your API keys (including openRouter.apiKey), tickers, RSS feeds, and schedules.
Start the server:
```
npm start
```

API

GET /articles?q=&source=&from=&to=&limit=&offset=
GET /articles?similar_to={id}&limit=
GET /articles?topic={query}&limit=
GET /articles/:id
GET /status
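
The query parameters above can be combined from a small client. This is a minimal sketch, not part of the project: the base URL (host and port), the helper names buildArticlesUrl and fetchArticles, and the assumption that the server responds with JSON are all illustrative. The accepted formats for from, to, and source are not specified here, so check them against your own data before relying on them.

```javascript
// Hypothetical base URL; the real host and port depend on your server configuration.
const BASE_URL = "http://localhost:3000";

// Build an /articles URL, omitting parameters that are unset or empty.
// Works for keyword search (q, source, from, to, limit, offset),
// similarity search (similar_to, limit), and topic search (topic, limit).
function buildArticlesUrl(params = {}, baseUrl = BASE_URL) {
  const url = new URL("/articles", baseUrl);
  for (const [key, value] of Object.entries(params)) {
    if (value !== undefined && value !== null && value !== "") {
      url.searchParams.set(key, String(value));
    }
  }
  return url.toString();
}

// Fetch one page of results (uses the global fetch available in Node.js 18+).
// Assumption: the endpoint returns a JSON body.
async function fetchArticles(params) {
  const response = await fetch(buildArticlesUrl(params));
  if (!response.ok) {
    throw new Error(`Request failed with status ${response.status}`);
  }
  return response.json();
}
```

For example, buildArticlesUrl({ q: "earnings", limit: 5 }) produces a keyword query, while buildArticlesUrl({ similar_to: 42, limit: 10 }) targets the similarity variant.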

Notes

SQLite archive file defaults to ./archive.sqlite.
Deduplication is enforced on url; normalized titles are stored and indexed for matching but are not unique.
newsCrawler reuses rssFeeds as the publisher catalog, derives one crawler source per feed label, and supports disabledLabels plus per-label overrides for seeds and allowed hosts.
Article body extraction runs asynchronously after insertion, with hourly retries for rows still missing content.
Main article images are stored as ultra-compressed base64 WebP.
Embeddings are generated asynchronously with OpenRouter perplexity/pplx-embed-v1-0.6b and indexed in sqlite-vec for similarity search.
Topic search caches normalized query embeddings in SQLite and falls back to OpenRouter on cache miss.
SEC requests use the configured User-Agent.
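
The topic-search caching described in the notes above follows a cache-then-fallback pattern, which can be sketched as below. This is an illustration under stated assumptions, not the project's actual code: the normalization rules (trim, lowercase, collapse whitespace) are a guess, a Map stands in for the SQLite cache table, and embedRemote is a hypothetical async function that would call the embedding API.

```javascript
// Normalize a query so trivially different spellings share one cache entry.
// Assumption: trim, lowercase, and collapse whitespace; the real rules may differ.
function normalizeQuery(query) {
  return query.trim().toLowerCase().replace(/\s+/g, " ");
}

// Create a cached embedder.
// `embedRemote` is a hypothetical async function (normalized query -> vector).
// `cache` is a Map standing in for the SQLite-backed cache.
function createQueryEmbedder(embedRemote, cache = new Map()) {
  return async function embedQuery(query) {
    const key = normalizeQuery(query);
    if (cache.has(key)) {
      return cache.get(key); // cache hit: no remote call
    }
    const vector = await embedRemote(key); // cache miss: fall back to the API
    cache.set(key, vector);
    return vector;
  };
}
```

Keying the cache on the normalized query rather than the raw string means that queries differing only in case or spacing reuse one stored embedding instead of triggering another remote call.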