Compare commits

39 Commits

Author SHA1 Message Date
MkQtS
b20cf00e07 Add more cn domains (#3249)
* add growingio

* category-cdn-cn: add dfyun.com.cn

* category-collaborate-cn: add feihengip.com

* category-dev-cn: add aardio.com

* category-education-cn: add biyehome.net

* category-enterprise-query-platform-cn: add xinchacha domains

* category-media-cn: add more domains

* category-social-media-cn: add fanfou.com

* category-wiki-cn: add chaz.fun
2026-02-05 21:32:06 +08:00
jinqiang zhang
027b8b3409 dji: add djigate.com (#3248) 2026-02-05 20:20:39 +08:00
xd DG
535dc789b9 Add geosite:radiko (#3247)
* Add geosite:radiko

* Sort domains and include radiko in category-entertainment

---------

Co-authored-by: terada46 <mizukiloveu@gmail.com>
2026-02-05 17:30:18 +08:00
MkQtS
311b281000 improve codes (#3246) 2026-02-04 15:03:04 +08:00
秋野かえで
bfb35d7b68 split githubcopilot.com to github-copilot (#3245) 2026-02-04 14:34:55 +08:00
深鸣
daf4c10d0c category-entertainment-cn: add anitabi.cn (#3244) 2026-02-04 13:58:38 +08:00
深鸣
a188c2c058 geolocation-!cn: add osmand.net (#3243) 2026-02-04 13:57:46 +08:00
MkQtS
947556aa16 Improve codes (#3242)
* main.go: improve code

* main.go: move refMap from global variable to local

* main.go: allow tld to be a parent domain

* datdump: improve code
2026-02-03 22:38:18 +08:00
susaninz
44de14725e kinopub: add cdn2cdn.com, cdn2site.com, pushbr.com (#3240)
These CDN domains are used by Kinopub for:
- cdn2cdn.com: video streaming CDN
- cdn2site.com: video streaming CDN
- pushbr.com: poster/thumbnail images

Discovered via network traffic analysis on the Kinopub web app.
Without these domains proxied, poster images fail to load.

---------

Co-authored-by: Ivan Slezkin <ivanslezkin@Mac.lan>
2026-02-03 19:04:53 +08:00
sergeevms
c638ec66f0 salesforce: add salesforce-setup.com (#3239) 2026-02-02 23:31:35 +08:00
susaninz
4c8b1438f8 kinopub: add cdn-service.space (#3220)
This domain is used by the Kinopub Android TV app for version checking.
Without it, the app hangs on startup when accessed from regions where
this domain is blocked.

Discovered during network traffic analysis on 2026-01-27.
2026-02-02 23:15:54 +08:00
Emik
3399285ea9 add pjsekai.sega.jp to projectsekai (#3236) 2026-02-01 21:35:51 +08:00
⑨bingyin
62346cf6b7 Add bsappapi.com to Binance (#3235) 2026-02-01 21:30:20 +08:00
jinqiang zhang
8dee321846 qcloud: add edgeone.cool (#3237) 2026-02-01 21:28:10 +08:00
fernvenue
b117cf851f Add packages.microsoft.com to microsoft-dev. (#3234) 2026-02-01 11:58:25 +08:00
jinqiang zhang
0b6606758d add louisvuitton (#3233) 2026-01-31 18:04:21 +08:00
Blackteahamburger
fcf9c67d83 category-education-cn: add zjzs.net (#3232) 2026-01-30 19:20:41 +08:00
MkQtS
56e0b47c73 Clean up ad lists (#3231)
* category-ads-all: include adjust

* category-ads-all: include clearbit

* category-ads-all: include ogury

* category-ads-all: include openx

* category-ads-all: include pubmatic

and remove pubmatic-ads

* category-ads-all: include segment

* category-ads-all: include supersonic

* geolocation-cn: remove the inclusion of umeng

it's included in alibaba

* add unitychina

* remove unity-ads

use unity@ads or unitychina@ads instead
2026-01-30 12:10:37 +08:00
Signaliks
4f45866be4 Update cloudflare (#3229) 2026-01-29 13:34:52 +08:00
sergeevms
40d763daca Update atlassian (#3228)
* Update atlassian

* Supplement and sort

data source: https://support.atlassian.com/organization-administration/docs/ip-addresses-and-domains-for-atlassian-cloud-products/

---------

Co-authored-by: MkQtS <81752398+MkQtS@users.noreply.github.com>
2026-01-29 13:33:48 +08:00
MkQtS
6c91898557 Cleanup ad lists (#3227)
Merge ad lists containing too few rules.

merged/removed lists:

adcolony-ads applovin-ads atom-data-ads emogi-ads flurry-ads
growingio-ads hiido-ads hotjar-ads inner-active-ads mopub-ads
mxplayer-ads newrelic-ads pocoiq-ads tagtic-ads tappx-ads uberads-ads
2026-01-28 17:43:54 +08:00
MkQtS
91da593233 apple: add aod-ssl.itunes.apple.com with cn attr (#3226) 2026-01-28 16:51:48 +08:00
TripleA
9f1c6b6922 Add Bohemia Interactive and Battleye domains (#3223) 2026-01-28 16:41:32 +08:00
MkQtS
b3bae7de8f Update category-ads (#3222)
* remove ads attr from openaicom.imgix.net

imgix.net is serving for pictures, not ads/tracking

* category-ads: include more ad domains
2026-01-28 13:07:34 +08:00
Jinzhe Zeng
4e9b28f951 add crixet.com to openai (#3221)
Crixet has been acquired by OpenAI, per https://crixet.com
2026-01-28 11:49:57 +08:00
xiyao
3c0a538219 samsung: add ospserver.net (#3219)
Samsung OneUI update server
2026-01-27 16:52:46 +08:00
MkQtS
2160230ef9 terabox: add more domains (#3218) 2026-01-27 15:24:47 +08:00
MkQtS
5c38f34456 Add cmd/datdump/main.go (#3213)
* Feat: add a new datdump tool

* Refactor: address code review comments

* Refactor: remove export all from main program

use datdump instead

* Refactor: allow spaces in exportlists

e.g. `--exportlists="lista, listb"`

* all: cleanup

* apply review suggestion

---------

Co-authored-by: database64128 <free122448@hotmail.com>
2026-01-24 23:11:35 +08:00
Zeehan2005
8e62b9b541 Enhance README with additional attribute @cn details (#3212)
Expanded the explanation of attributes in the README to include domains available in China mainland.

[skip ci]
2026-01-24 16:05:37 +08:00
EpLiar
85edae7ba1 Add new Binance API endpoint 'binanceru.net' (#3210) 2026-01-23 14:57:08 +08:00
MkQtS
1bd07b2e76 Support to export all lists to a plain yml (#3211)
* Refactor: improve deduplicate

* Feat: support to export all lists to a plain yml

use: `--exportlists=_all_`

* Docs: add links for dlc plain yml

[skip ci]
2026-01-23 14:56:35 +08:00
Kusu
614a880a55 okx: add okx.cab (#3209) 2026-01-22 22:14:50 +08:00
MkQtS
676832d14a Improve value checkers and docs (#3208)
* Refactor: improve value checkers

* Docs: small improvements

[skip ci]
2026-01-22 18:46:53 +08:00
MkQtS
a2f08a142c Docs: update for selective inclusion and affiliations (#3207)
[skip ci]
2026-01-22 14:30:10 +08:00
MkQtS
2359ad7f8e Add eneba (#3205) 2026-01-22 11:40:44 +08:00
MkQtS
330592feff xiaohongshu: add rednotecdn.com (#3204) 2026-01-22 11:30:18 +08:00
blackyau
f44fbc801d category-hospital-cn: add cd120.com (#3203) 2026-01-22 10:57:38 +08:00
深鸣
03c5e05305 Add more !cn domains (#3200) 2026-01-21 09:44:07 +08:00
深鸣
bd21f84381 Add more cn domains (#3199)
* category-games-cn: add arcaea.cn
* geolocation-cn: add baimiao
2026-01-21 09:42:31 +08:00
74 changed files with 620 additions and 269 deletions

View File

@@ -33,15 +33,17 @@ jobs:
echo "TAG_NAME=$(date +%Y%m%d%H%M%S)" >> $GITHUB_ENV
shell: bash
- name: Build dlc.dat file
- name: Build dlc.dat and plain lists
run: |
cd code || exit 1
go run ./ --outputdir=../ --exportlists=category-ads-all,tld-cn,cn,tld-\!cn,geolocation-\!cn,apple,icloud
go run ./cmd/datdump/main.go --inputdata=../dlc.dat --outputdir=../ --exportlists=_all_
cd ../ && rm -rf code
- name: Generate dlc.dat sha256 hash
run: |
sha256sum dlc.dat > dlc.dat.sha256sum
sha256sum dlc.dat_plain.yml > dlc.dat_plain.yml.sha256sum
- name: Generate Zip
run: |
@@ -66,6 +68,6 @@ jobs:
- name: Release and upload assets
run: |
gh release create ${{ env.TAG_NAME }} --generate-notes --latest --title ${{ env.RELEASE_NAME }} ./dlc.dat ./dlc.dat.*
gh release create ${{ env.TAG_NAME }} --generate-notes --latest --title ${{ env.RELEASE_NAME }} ./dlc.dat ./dlc.dat.* ./dlc.dat_plain.yml ./dlc.dat_plain.yml.*
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}

1
.gitignore vendored
View File

@@ -8,4 +8,5 @@
dlc.dat
# Exported plaintext lists.
/*.yml
/*.txt

View File

@@ -10,12 +10,12 @@ This project is not opinionated. In other words, it does NOT endorse, claim or i
- **dlc.dat**[https://github.com/v2fly/domain-list-community/releases/latest/download/dlc.dat](https://github.com/v2fly/domain-list-community/releases/latest/download/dlc.dat)
- **dlc.dat.sha256sum**[https://github.com/v2fly/domain-list-community/releases/latest/download/dlc.dat.sha256sum](https://github.com/v2fly/domain-list-community/releases/latest/download/dlc.dat.sha256sum)
- **dlc.dat_plain.yml**[https://github.com/v2fly/domain-list-community/releases/latest/download/dlc.dat_plain.yml](https://github.com/v2fly/domain-list-community/releases/latest/download/dlc.dat_plain.yml)
- **dlc.dat_plain.yml.sha256sum**[https://github.com/v2fly/domain-list-community/releases/latest/download/dlc.dat_plain.yml.sha256sum](https://github.com/v2fly/domain-list-community/releases/latest/download/dlc.dat_plain.yml.sha256sum)
## Notice
Rules with the `@!cn` attribute have been removed from cn lists. `geosite:geolocation-cn@!cn` is no longer available.
Check [#390](https://github.com/v2fly/domain-list-community/issues/390), [#3119](https://github.com/v2fly/domain-list-community/pull/3119) and [#3198](https://github.com/v2fly/domain-list-community/pull/3198) for more information.
Rules with the `@!cn` attribute have been removed from cn lists. `geosite:geolocation-cn@!cn` is no longer available. Check [#390](https://github.com/v2fly/domain-list-community/issues/390), [#3119](https://github.com/v2fly/domain-list-community/pull/3119) and [#3198](https://github.com/v2fly/domain-list-community/pull/3198) for more information.
Please report if you have any problems or questions.
@@ -93,38 +93,45 @@ All data are under `data` directory. Each file in the directory represents a sub
# comments
include:another-file
domain:google.com @attr1 @attr2
full:analytics.google.com @ads
keyword:google
regexp:www\.google\.com$
full:www.google.com
regexp:^odd[1-7]\.example\.org(\.[a-z]{2})?$
```
**Syntax:**
> [!NOTE]
> Adding new `regexp` and `keyword` rules is discouraged because it is easy to use them incorrectly, and proxy software cannot efficiently match these types of rules.
> [!NOTE]
> The following types of rules are **NOT** fully compatible with the ones defined by users in the V2Ray config file. Do **NOT** copy and paste them directly.
- Comment begins with `#`. It may begin anywhere in the file. The content in the line after `#` is treated as comment and ignored in production.
- Inclusion begins with `include:`, followed by the file name of an existing file in the same directory.
- Subdomain begins with `domain:`, followed by a valid domain name. The prefix `domain:` may be omitted.
- Keyword begins with `keyword:`, followed by a string.
- Regular expression begins with `regexp:`, followed by a valid regular expression (per Golang's standard).
- Full domain begins with `full:`, followed by a complete and valid domain name.
- Domains (including `domain`, `keyword`, `regexp` and `full`) may have one or more attributes. Each attribute begins with `@` and followed by the name of the attribute.
> **Note:** Adding new `regexp` and `keyword` rules is discouraged because it is easy to use them incorrectly, and proxy software cannot efficiently match these types of rules.
- Keyword begins with `keyword:`, followed by a substring of a valid domain name.
- Regular expression begins with `regexp:`, followed by a valid regular expression (per Golang's standard).
- Domain rules (including `domain`, `full`, `keyword`, and `regexp`) may have zero or more attributes. Each attribute begins with `@`, followed by the name of the attribute. Attributes remain available in the final lists and `dlc.dat`.
- Domain rules may have zero or more affiliations, each of which additionally adds the domain rule to the affiliated target list. Each affiliation begins with `&`, followed by the name of the target list (regardless of whether the target has a dedicated file in the data path). Affiliations are a data-management device and do not remain in the final lists or `dlc.dat`.
- Inclusion begins with `include:`, followed by the name of another valid domain list. A simple `include:listb` in file `lista` adds all domain rules of `listb` to `lista`. An inclusion with attributes is a selective inclusion: `include:listb @attr1 @-attr2` adds only those domain rules *with* `@attr1` **and** *without* `@attr2`. Inclusions are a data-management device and do not remain in the final lists or `dlc.dat`.
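
The rule syntax above can be illustrated with a small parser. This is a minimal sketch and not the project's actual parser: the `Rule` type and the `parseRule` function are hypothetical names introduced here, and the comment handling is deliberately simplified (it would mis-handle a `regexp:` value that contains `#`).

```go
package main

import (
	"fmt"
	"strings"
)

// Rule is a hypothetical in-memory form of one data line.
// The real types used by main.go may differ.
type Rule struct {
	Type         string   // "domain", "full", "keyword" or "regexp"
	Value        string   // domain name, keyword or regular expression
	Attrs        []string // names taken from "@name" tokens
	Affiliations []string // names taken from "&name" tokens
}

// parseRule parses one non-include line such as
// "full:analytics.google.com @ads &category-ads # note".
func parseRule(line string) (Rule, error) {
	// Simplification: everything after the first '#' is a comment.
	if i := strings.IndexByte(line, '#'); i >= 0 {
		line = line[:i]
	}
	fields := strings.Fields(line)
	if len(fields) == 0 {
		return Rule{}, fmt.Errorf("empty line")
	}

	// The "domain:" prefix may be omitted, so default to it.
	r := Rule{Type: "domain", Value: fields[0]}
	if typ, val, ok := strings.Cut(fields[0], ":"); ok {
		switch typ {
		case "domain", "full", "keyword", "regexp":
			r.Type, r.Value = typ, val
		default:
			return Rule{}, fmt.Errorf("unknown rule type %q", typ)
		}
	}

	// Remaining tokens are attributes (@) or affiliations (&).
	for _, f := range fields[1:] {
		switch {
		case strings.HasPrefix(f, "@") && len(f) > 1:
			r.Attrs = append(r.Attrs, f[1:])
		case strings.HasPrefix(f, "&") && len(f) > 1:
			r.Affiliations = append(r.Affiliations, f[1:])
		default:
			return Rule{}, fmt.Errorf("unexpected token %q", f)
		}
	}
	return r, nil
}

func main() {
	r, err := parseRule("full:analytics.google.com @ads &category-ads # note")
	if err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", r)
}
```

Keeping attributes and affiliations in separate slices mirrors the distinction drawn above: attributes survive into the output, while affiliations are consumed while the lists are being resolved.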
## How it works
The entire `data` directory will be built into an external `geosite` file for Project V. Each file in the directory represents a section in the generated file.
To generate a section:
**General steps:**
1. Remove all the comments in the file.
2. Replace `include:` lines with the actual content of the file.
3. Omit all empty lines.
4. Generate each `domain:` line into a [sub-domain routing rule](https://github.com/v2fly/v2ray-core/blob/master/app/router/routercommon/common.proto#L21).
5. Generate each `full:` line into a [full domain routing rule](https://github.com/v2fly/v2ray-core/blob/master/app/router/routercommon/common.proto#L23).
6. Generate each `keyword:` line into a [plain domain routing rule](https://github.com/v2fly/v2ray-core/blob/master/app/router/routercommon/common.proto#L17).
7. Generate each `regexp:` line into a [regex domain routing rule](https://github.com/v2fly/v2ray-core/blob/master/app/router/routercommon/common.proto#L19).
1. Read files in the data path (ignore all comments and empty lines).
2. Parse and resolve source data, turn affiliations and inclusions into actual domain rules in proper lists.
3. Deduplicate and sort rules in every list.
4. Export desired plain text lists.
5. Generate `dlc.dat`:
- turn each `domain:` line into a [sub-domain routing rule](https://github.com/v2fly/v2ray-core/blob/master/app/router/routercommon/common.proto#L21).
- turn each `full:` line into a [full domain routing rule](https://github.com/v2fly/v2ray-core/blob/master/app/router/routercommon/common.proto#L23).
- turn each `keyword:` line into a [plain domain routing rule](https://github.com/v2fly/v2ray-core/blob/master/app/router/routercommon/common.proto#L17).
- turn each `regexp:` line into a [regex domain routing rule](https://github.com/v2fly/v2ray-core/blob/master/app/router/routercommon/common.proto#L19).
Read [main.go](./main.go) for details.
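
Steps 2 and 3 can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in `main.go`: the `Rule` type and the `matches`, `include` and `dedupSort` helpers are hypothetical, and treating two rules as duplicates when their type and value are equal (ignoring attributes) is an assumption made for brevity.

```go
package main

import (
	"fmt"
	"slices"
	"sort"
)

// Rule is a hypothetical resolved domain rule.
type Rule struct {
	Type, Value string
	Attrs       []string
}

// matches reports whether r carries every attribute in must
// and none of the attributes in mustNot.
func matches(r Rule, must, mustNot []string) bool {
	for _, a := range must {
		if !slices.Contains(r.Attrs, a) {
			return false
		}
	}
	for _, a := range mustNot {
		if slices.Contains(r.Attrs, a) {
			return false
		}
	}
	return true
}

// include appends to dst the rules of src selected by the filter,
// as "include:listb @attr1 @-attr2" would.
func include(dst, src []Rule, must, mustNot []string) []Rule {
	for _, r := range src {
		if matches(r, must, mustNot) {
			dst = append(dst, r)
		}
	}
	return dst
}

// dedupSort drops rules whose type and value were already seen
// (an assumption of this sketch), then sorts by value and type.
func dedupSort(rules []Rule) []Rule {
	seen := make(map[string]bool, len(rules))
	out := make([]Rule, 0, len(rules))
	for _, r := range rules {
		key := r.Type + ":" + r.Value
		if seen[key] {
			continue
		}
		seen[key] = true
		out = append(out, r)
	}
	sort.Slice(out, func(i, j int) bool {
		if out[i].Value != out[j].Value {
			return out[i].Value < out[j].Value
		}
		return out[i].Type < out[j].Type
	})
	return out
}

func main() {
	listb := []Rule{
		{Type: "domain", Value: "b.example", Attrs: []string{"attr1"}},
		{Type: "domain", Value: "a.example", Attrs: []string{"attr1", "attr2"}},
		{Type: "full", Value: "c.example"},
		{Type: "domain", Value: "b.example", Attrs: []string{"attr1"}},
	}
	lista := []Rule{{Type: "domain", Value: "z.example"}}

	// Equivalent of "include:listb @attr1 @-attr2" inside lista.
	lista = include(lista, listb, []string{"attr1"}, []string{"attr2"})
	for _, r := range dedupSort(lista) {
		fmt.Printf("%s:%s\n", r.Type, r.Value)
	}
}
```

An inclusion without attributes is the same call with both filters empty, in which case every rule of the source list is added.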
## How to organize domains
@@ -134,7 +141,7 @@ Theoretically any string can be used as the name, as long as it is a valid file
### Attributes
Attribute is useful for sub-group of domains, especially for filtering purpose. For example, the list of `google` domains may contains its main domains, as well as domains that serve ads. The ads domains may be marked by attribute `@ads`, and can be used as `geosite:google@ads` in V2Ray routing.
Attributes are useful for marking sub-groups of domains, especially for filtering purposes. For example, the `google` list may contain its main domains as well as domains that serve ads. The ad domains may be marked with the attribute `@ads` and used as `geosite:google@ads` in V2Ray routing. Domains and services that originate from outside China mainland but have access points in China mainland may be marked with the attribute `@cn`.
## Contribution guideline

177
cmd/datdump/main.go Normal file
View File

@@ -0,0 +1,177 @@
package main

import (
	"bufio"
	"flag"
	"fmt"
	"os"
	"path/filepath"
	"strings"

	"github.com/v2fly/domain-list-community/internal/dlc"
	router "github.com/v2fly/v2ray-core/v5/app/router/routercommon"
	"google.golang.org/protobuf/proto"
)

var (
	inputData   = flag.String("inputdata", "dlc.dat", "Name of the geosite dat file")
	outputDir   = flag.String("outputdir", "./", "Directory to place all generated files")
	exportLists = flag.String("exportlists", "", "Lists to be exported, separated by ',' (empty for _all_)")
)

type DomainRule struct {
	Type  string
	Value string
	Attrs []string
}

type DomainList struct {
	Name  string
	Rules []DomainRule
}

func (d *DomainRule) domain2String() string {
	var dstr strings.Builder
	dstr.Grow(len(d.Type) + len(d.Value) + 10)
	dstr.WriteString(d.Type)
	dstr.WriteByte(':')
	dstr.WriteString(d.Value)
	for i, attr := range d.Attrs {
		if i == 0 {
			dstr.WriteByte(':')
		} else {
			dstr.WriteByte(',')
		}
		dstr.WriteByte('@')
		dstr.WriteString(attr)
	}
	return dstr.String()
}

func loadGeosite(path string) ([]DomainList, map[string]*DomainList, error) {
	data, err := os.ReadFile(path)
	if err != nil {
		return nil, nil, fmt.Errorf("failed to read geosite file: %w", err)
	}
	vgeositeList := new(router.GeoSiteList)
	if err := proto.Unmarshal(data, vgeositeList); err != nil {
		return nil, nil, fmt.Errorf("failed to unmarshal: %w", err)
	}
	domainLists := make([]DomainList, len(vgeositeList.Entry))
	domainListByName := make(map[string]*DomainList, len(vgeositeList.Entry))
	for i, vsite := range vgeositeList.Entry {
		rules := make([]DomainRule, 0, len(vsite.Domain))
		for _, vdomain := range vsite.Domain {
			rule := DomainRule{Value: vdomain.Value}
			switch vdomain.Type {
			case router.Domain_RootDomain:
				rule.Type = dlc.RuleTypeDomain
			case router.Domain_Regex:
				rule.Type = dlc.RuleTypeRegexp
			case router.Domain_Plain:
				rule.Type = dlc.RuleTypeKeyword
			case router.Domain_Full:
				rule.Type = dlc.RuleTypeFullDomain
			default:
				return nil, nil, fmt.Errorf("invalid rule type: %+v", vdomain.Type)
			}
			for _, vattr := range vdomain.Attribute {
				rule.Attrs = append(rule.Attrs, vattr.Key)
			}
			rules = append(rules, rule)
		}
		domainLists[i] = DomainList{
			Name:  strings.ToUpper(vsite.CountryCode),
			Rules: rules,
		}
		domainListByName[domainLists[i].Name] = &domainLists[i]
	}
	return domainLists, domainListByName, nil
}

func exportSite(name string, domainListByName map[string]*DomainList) error {
	domainList, ok := domainListByName[strings.ToUpper(name)]
	if !ok {
		return fmt.Errorf("list %q does not exist", name)
	}
	if len(domainList.Rules) == 0 {
		return fmt.Errorf("list %q is empty", name)
	}
	file, err := os.Create(filepath.Join(*outputDir, name+".yml"))
	if err != nil {
		return err
	}
	defer file.Close()
	w := bufio.NewWriter(file)
	fmt.Fprintf(w, "%s:\n", name)
	for _, domain := range domainList.Rules {
		fmt.Fprintf(w, " - %q\n", domain.domain2String())
	}
	return w.Flush()
}

func exportAll(filename string, domainLists []DomainList) error {
	file, err := os.Create(filepath.Join(*outputDir, filename))
	if err != nil {
		return err
	}
	defer file.Close()
	w := bufio.NewWriter(file)
	w.WriteString("lists:\n")
	for _, domainList := range domainLists {
		fmt.Fprintf(w, " - name: %s\n", strings.ToLower(domainList.Name))
		fmt.Fprintf(w, " length: %d\n", len(domainList.Rules))
		w.WriteString(" rules:\n")
		for _, domain := range domainList.Rules {
			fmt.Fprintf(w, " - %q\n", domain.domain2String())
		}
	}
	return w.Flush()
}

func run() error {
	// Make sure output directory exists
	if err := os.MkdirAll(*outputDir, 0755); err != nil {
		return fmt.Errorf("failed to create output directory: %w", err)
	}
	fmt.Printf("loading source data %q...\n", *inputData)
	domainLists, domainListByName, err := loadGeosite(*inputData)
	if err != nil {
		return fmt.Errorf("failed to loadGeosite: %w", err)
	}
	var exportListSlice []string
	for raw := range strings.SplitSeq(*exportLists, ",") {
		if trimmed := strings.TrimSpace(raw); trimmed != "" {
			exportListSlice = append(exportListSlice, trimmed)
		}
	}
	if len(exportListSlice) == 0 {
		exportListSlice = []string{"_all_"}
	}
	for _, eplistname := range exportListSlice {
		if strings.EqualFold(eplistname, "_all_") {
			if err := exportAll(filepath.Base(*inputData)+"_plain.yml", domainLists); err != nil {
				fmt.Printf("failed to exportAll: %v\n", err)
				continue
			}
		} else {
			if err := exportSite(eplistname, domainListByName); err != nil {
				fmt.Printf("failed to exportSite: %v\n", err)
				continue
			}
		}
		fmt.Printf("list: %q has been exported successfully.\n", eplistname)
	}
	return nil
}

func main() {
	flag.Parse()
	if err := run(); err != nil {
		fmt.Printf("Fatal error: %v\n", err)
		os.Exit(1)
	}
}

View File

@@ -1 +0,0 @@
adcolony.com @ads

View File

@@ -1,4 +1,4 @@
adjust.com @ads
adjust.net.in @ads
adjust.io @ads
adjust.net.in @ads
adjust.world @ads

View File

@@ -756,6 +756,7 @@ full:amp-api-edge.apps.apple.com @cn
full:amp-api-search-edge.apps.apple.com @cn
full:amp-api.apps.apple.com @cn
full:amp-api.music.apple.com @cn
full:aod-ssl.itunes.apple.com @cn
full:aod.itunes.apple.com @cn
full:api-edge.apps.apple.com @cn
full:apptrailers.itunes.apple.com @cn

View File

@@ -1,2 +0,0 @@
applovin.com @ads
applvn.com @ads

View File

@@ -1,7 +1,11 @@
include:trello
atl-paas.net
atlassian-dev.net
atlassian.com
atlassian.net
bitbucket.io
bitbucket.org
jira.com
ss-inf.net
statuspage.io
include:trello

View File

@@ -1,3 +0,0 @@
atom-data.io @ads
analytics-data.io @ads
ironbeast.io @ads

View File

@@ -28,8 +28,10 @@ binancezh.top
# API
binanceapi.com
binanceru.net
bnbstatic.com
bntrace.com
bsappapi.com
nftstatic.com
# saas

9
data/bohemia Normal file
View File

@@ -0,0 +1,9 @@
arma3.com
armaplatform.com
bistudio.com
bohemia.net
dayz.com
makearmanotwar.com
silicagame.com
vigorgame.com
ylands.com

View File

@@ -1,29 +1,21 @@
# This file contains domains that clearly serving ads
include:acfun-ads
include:adcolony-ads
include:adjust-ads
include:adobe-ads
include:alibaba-ads
include:amazon-ads
include:apple-ads
include:applovin-ads
include:atom-data-ads
include:baidu-ads
include:bytedance-ads
include:category-ads-ir
include:cctv @ads
include:clearbit-ads
include:disney @ads
include:dmm-ads
include:duolingo-ads
include:emogi-ads
include:flurry-ads
include:gamersky @ads
include:google-ads
include:growingio-ads
include:hiido-ads
include:hotjar-ads
include:hetzner @ads
include:hunantv-ads
include:inner-active-ads
include:iqiyi-ads
include:jd-ads
include:kuaishou-ads
@@ -31,30 +23,25 @@ include:kugou-ads
include:letv-ads
include:meta-ads
include:microsoft-ads
include:mopub-ads
include:mxplayer-ads
include:netease-ads
include:newrelic-ads
include:ogury-ads
include:ookla-speedtest-ads
include:openx-ads
include:openai @ads
include:picacg @ads
include:pocoiq-ads
include:pubmatic-ads
include:pikpak @ads
include:pixiv @ads
include:qihoo360-ads
include:segment-ads
include:samsung @ads
include:sina-ads
include:snap @ads
include:sohu-ads
include:spotify-ads
include:supersonic-ads
include:tagtic-ads
include:tappx-ads
include:television-ads
include:tencent-ads
include:tendcloud @ads
include:uberads-ads
include:twitter @ads
include:umeng-ads
include:unity-ads
include:unity @ads
include:unitychina @ads
include:xhamster-ads
include:xiaomi-ads
include:ximalaya-ads
@@ -82,21 +69,26 @@ cdn.banclip.com
cfts1tifqr.com
contentabc.com
cretgate.com
data.flurry.com
decide.mixpanel.com
emogi.com
ero-advertising.com
eroadvertising.com
evt.mxplay.com
exoclick.com
exosrv.com
go2.global
gozendata.com
gzads.com
gz-data.com
gzads.com
img-bss.csdn.net
imglnkc.com
imglnkd.com
inner-active.mobi
innovid.com
jads.co
jl3.yjaxa.top
js-agent.newrelic.com
juicyads.com
kepler-37b.com
leanplum.com
@@ -104,22 +96,26 @@ lqc006.com
moat.com
moatads.com
mobwithad.com
mopub.com
onesignal.com
realsrv.com
s4yxaqyq95.com
shhs-ydd8x2.yjrmss.cn
ssp.api.tappx.com
static.hotjar.com
static.javhd.com
tm-banners.gamingadult.com
trafficfactory.biz
tsyndicate.com
uberads.com
wwads.cn
# 36Kr
adx.36kr.com
# 12306
ad.12306.cn
# 36Kr
adx.36kr.com
# AdHub
hubcloud.com.cn
@@ -130,6 +126,10 @@ beizi.biz
click.ali213.net
pbmp.ali213.net
# AppLovin
applovin.com
applvn.com
# Caixin
# regexp:^pinggai\d\.caixin\.com$
full:pinggai0.caixin.com
@@ -147,12 +147,29 @@ full:pinggai9.caixin.com
adq.chinaso.com
stat.chinaso.com
# hiido
mlog.hiido.com
ylog.hiido.com
# Httpool
toboads.com
# ironSource Atom
analytics-data.io
atom-data.io
ironbeast.io
# pocoiq
cdn.pocoiq.cn
oct.pocoiq.cn
# Qiniu
dn-growing.qbox.me
# tagtic
g1.tagtic.cn
xy-log.tagtic.cn
# UNI Marketing
ad.unimhk.com

View File

@@ -1,30 +1,35 @@
# This file contains domains of all ads providers, including both the domains that serves ads, and the domains of providers themselves.
include:category-ads
include:adjust
include:clearbit
include:growingio
include:ogury
include:openx
include:pubmatic
include:segment
include:supersonic
include:taboola
1rx.io @ads
7box.vip @ads
ad-delivery.net @ads
adcolony.com @ads
adinplay.com @ads
adnxs.com @ads
adview.cn @ads
ads.trafficjunky.net @ads
advertserve.com @ads
adview.cn @ads
casalemedia.com @ads
contextual.media.net @ads
cpmstar.com @ads
demdex.net @ads
httpool.com @ads
lijit.com @ads
1rx.io @ads
mfadsrvr.com @ads
mgid.com @ads
ns1p.net @ads
pubmatic.com @ads
sigmob.com @ads
snapads.com @ads
spotxchange.com @ads
unimhk.com @ads
upapi.net @ads
include:taboola
include:category-ads

View File

@@ -4,6 +4,7 @@ include:cerebras
include:comfy
include:cursor
include:elevenlabs
include:github-copilot
include:google-deepmind
include:groq
include:huggingface

View File

@@ -6,12 +6,14 @@ include:qiniu
include:upai
include:wangsu
## 创世云
# 创世云
chuangcache.com
chuangcdn.com
## FUNCDN
# 大风云CDN
dfyun.com.cn
# FUNCDN
funcdn.com
## 北京知道创宇信息技术股份有限公司
# 北京知道创宇信息技术股份有限公司
jiashule.com
jiasule.com
yunaq.com

View File

@@ -4,6 +4,8 @@
asklink.com
## EasyTier
easytier.cn
## 飞衡HTTP
feihengip.com
## Oray
oray.com
oray.net

View File

@@ -48,6 +48,7 @@ include:kakao
include:kaspersky
include:lg
include:logitech
include:louisvuitton
include:mailru-group
include:meta
include:microsoft

View File

@@ -18,7 +18,9 @@ include:segmentfault
include:sxl
include:tencent-dev
include:ubuntukylin
include:unitychina
aardio.com
jinrishici.com
openvela.com
tipdm.org

View File

@@ -71,6 +71,8 @@ baicizhan.com
baicizhan.org
bczcdn.com
bczeducation.cn
# 毕业之家科研服务平台
biyehome.net
# Burning Vocabulary
burningvocabulary.cn
burningvocabulary.com
@@ -142,3 +144,5 @@ ystbds.com
zhan.com
# 智慧树
zhihuishu.com
# 浙江省教育考试院
zjzs.net

View File

@@ -2,6 +2,9 @@ include:playcover
include:fflogs
include:trackernetwork
# Anti-Cheat
battleye.com
# Android Emulator
bluestacks.com
ldmnq.com @cn
@@ -16,5 +19,5 @@ prts.plus
heavenlywind.cc @cn
poi.moe
# Steam++ / Watt Toolkit
steampp.net @cn

View File

@@ -6,3 +6,7 @@ include:tianyancha
qichamao.com
qyyjt.cn
x315.com
# 信查查
xcc.cn
xinchacha.com

View File

@@ -54,6 +54,7 @@ include:pixiv
include:plutotv
include:pocketcasts
include:primevideo
include:radiko
include:roku
include:showtimeanytime
include:sling

View File

@@ -50,6 +50,8 @@ yeshen.com
51zmt.top
# 广东南方新媒体
aisee.tv
# 动画巡礼
anitabi.cn
# 暴风影音
baofeng.com
baofeng.net

View File

@@ -1,10 +1,12 @@
include:2kgames
include:blizzard
include:bluearchive
include:bohemia
include:curseforge
include:cygames
include:ea
include:embark
include:eneba
include:epicgames
include:escapefromtarkov
include:faceit

View File

@@ -16,6 +16,8 @@ include:yokaverse
7k7k.com
# 刀锋盒子 皖B2-20190103-4
9xgame.com
# 韵律谱面研究站 桂ICP备20001846号-3
arcaea.cn
# 《异象回声》游戏官网 沪ICP备2023010411号-1
astral-vector.com
# 九九互动 粤ICP备19068416号

View File

@@ -10,3 +10,6 @@ yctdyy.com
# 南方医科大学深圳医院
smuszh.com
# 四川大学华西医院
cd120.com

View File

@@ -78,6 +78,8 @@ freebuf.com
geekpark.net
# 光明网
gmw.com
# 硅谷网
guigu.org
# 和讯
hexun.com
# 河南广播电视台/大象网
@@ -134,6 +136,9 @@ xinhuanet.com
xinhuaxmt.com
# 维科网
ofweek.com
# PChome电脑之家
pchome.net
pchpic.net
# PConline 太平洋科技
3conline.com
pconline.com.cn

View File

@@ -1,26 +1,29 @@
# This list contains social media platforms inside China mainland.
include:coolapk
include:douban
include:gracg
include:hupu
include:meipian
include:okjike
include:sina @-!cn
include:xiaohongshu
include:yy
include:zhihu
tieba.baidu.com
tieba.com
# 杭州蛋蛋语音科技有限公司
dandan818.com
dandanvoice.com
# 脉脉
maimai.cn
taou.com
# 知识星球
zsxq.com
# This list contains social media platforms inside China mainland.
include:coolapk
include:douban
include:gracg
include:hupu
include:meipian
include:okjike
include:sina @-!cn
include:xiaohongshu
include:yy
include:zhihu
tieba.baidu.com
tieba.com
# 杭州蛋蛋语音科技有限公司
dandan818.com
dandanvoice.com
# 饭否
fanfou.com
# 脉脉
maimai.cn
taou.com
# 知识星球
zsxq.com

View File

@@ -4,6 +4,9 @@ mbalib.com
sec-wiki.com
shidianbaike.com
# 叉子周 手机博物馆
chaz.fun
# huijiwiki
huijistatic.com
huijiwiki.com

View File

@@ -41,6 +41,9 @@ cloudflarewarp.com
cloudflareworkers.com
encryptedsni.com
every1dns.net
foundationdns.com
foundationdns.net
foundationdns.org
imagedelivery.net
isbgpsafeyet.com
one.one.one

View File

@@ -2,6 +2,7 @@ dji.com
dji.ink
dji.net
djicdn.com
djigate.com
djiits.com
djiops.com
djiservice.org

View File

@@ -1 +0,0 @@
emogi.com @ads

2
data/eneba Normal file
View File

@@ -0,0 +1,2 @@
eneba.com
eneba.games

View File

@@ -1 +0,0 @@
data.flurry.com @ads

View File

@@ -102,6 +102,8 @@ include:w3schools
include:zotero
chemequations.com # 线上化学方程式!
geogebra.org
wolframalpha.com
# Entertainment & Games & Music & Podcasts & Videos
include:category-entertainment
@@ -269,6 +271,8 @@ ldoceonline.com
immersivetranslate.com # 沉浸式翻译 (国际版)
## OriginLab (Graphing for Science and Engineering)
originlab.com
## OsmAnd
osmand.net
# Software development
include:category-dev
@@ -300,6 +304,7 @@ include:wikimedia
atwiki.jp
touhouwiki.net
wiki.gg
# Others
include:avaxhome

View File

@@ -23,8 +23,8 @@ include:category-social-media-cn
# Advertisment & Analytics
include:getui
include:growingio
include:jiguang
include:umeng
# 神策数据
sensorsdata.cn
@@ -488,6 +488,12 @@ include:chinaunicom @-!cn
## IPIP ip地理位置数据库
include:ipip @-!cn
## 白描
baimiao.tech
baimiaoapp.com
shinescan.tech
uzero.cn
chaziyu.com # 滇ICP备2024035496号
fofa.info # Fofa网站测绘华顺信安
icplishi.com # 粤ICP备20009057号
@@ -658,7 +664,6 @@ ycrx360.com
9ht.com
9xu.com
a9vg.com
aardio.com # 皖ICP备09012014号
acetaffy.club # 粤ICP备2022042304号
adxvip.com
afzhan.com
@@ -714,7 +719,6 @@ bio-equip.com
biodiscover.com
bishijie.com
bitecoin.com
biyehome.net
bjcathay.com
bobo.com
bojianger.com
@@ -738,7 +742,6 @@ chachaba.com
changba.com
chaojituzi.net
chashebao.com
chaz.fun # 粤ICP备2022001828号-2
chazhengla.com
chazidian.com
che168.com
@@ -874,7 +877,6 @@ fanli.com
fangxiaoer.com
fanxian.com
fastapi.net
feihengip.com # 粤ICP备2023115330号-1
feihuo.com
feiniaomy.com
fengniao.com
@@ -898,7 +900,6 @@ gdrc.com
geektool.top # 极客Tool 蜀ICP备2024086015号-2
gezida.com
gfan.com
giocdn.com
globrand.com
gm86.com
gmz88.com
@@ -909,7 +910,6 @@ gongxiangcj.com
goosail.com
goufw.com
greenxiazai.com
growingio.com
gtags.net
guabu.com
guaiguai.com
@@ -917,7 +917,6 @@ guanaitong.com
guanhaobio.com
guanyierp.com # 沪ICP备14043335号-8
gucheng.com
guigu.org
guoxinmac.com
gupzs.com
gushiwen.org
@@ -1173,7 +1172,6 @@ p5w.net
paipaibang.com
paopaoche.net
pc6.com
pchome.net
pcpop.com
peccn.com
pgzs.com

View File

@@ -1,4 +1,5 @@
include:github-ads
include:github-copilot
include:npmjs
atom.io
@@ -14,7 +15,6 @@ github.dev
github.io
githubapp.com
githubassets.com
githubcopilot.com
githubhackathon.com
githubnext.com
githubpreview.dev

1
data/github-copilot Normal file
View File

@@ -0,0 +1 @@
githubcopilot.com

7
data/growingio Normal file
View File

@@ -0,0 +1,7 @@
# 北京易数科技
datayi.cn
gio.ren
giocdn.com
growin.cn
growingio.cn
growingio.com

View File

@@ -1 +0,0 @@
assets.growingio.com @ads

View File

@@ -1,2 +0,0 @@
mlog.hiido.com @ads
ylog.hiido.com @ads

View File

@@ -1 +0,0 @@
static.hotjar.com @ads

View File

@@ -1 +0,0 @@
inner-active.mobi @ads

View File

@@ -6,4 +6,9 @@ gfw.ovh # sub domains mirror
mos-gorsud.co # kinopub domain to generate a mirror site through gfw.ovh
# kinopub CDN servers
cdn-service.space
cdn2cdn.com
cdn2site.com
pushbr.com # poster images CDN
regexp:(\w+)-static-[0-9]+\.cdntogo\.net$

5
data/louisvuitton Normal file
View File

@@ -0,0 +1,5 @@
louisvuitton.cn @cn
louisvuitton.com
lvcampaign.com @cn
full:tp.louisvuitton.com @cn

View File

@@ -60,6 +60,7 @@ full:default.exp-tas.com
full:developer.microsoft.com
full:download.visualstudio.microsoft.com
full:dtlgalleryint.cloudapp.net
full:packages.microsoft.com
full:poshtestgallery.cloudapp.net
full:psg-int-centralus.cloudapp.net
full:psg-int-eastus.cloudapp.net

View File

@@ -1 +0,0 @@
mopub.com @ads

View File

@@ -1 +0,0 @@
evt.mxplay.com @ads

View File

@@ -1 +0,0 @@
js-agent.newrelic.com @ads

View File

@@ -1,3 +1,3 @@
ogury.co @ads
ogury.com @ads
presage.io @ads
ogury.co @ads

View File

@@ -1,8 +1,9 @@
okex.com
okx.com
okx-dns.com
okx-dns1.com
okx-dns2.com
okx.cab
okx.com
# OKC Browser
oklink.com @cn

View File

@@ -1,6 +1,7 @@
# Main domain
chatgpt.com
chat.com
chatgpt.com
crixet.com
oaistatic.com
oaiusercontent.com
openai.com
@@ -10,13 +11,13 @@ sora.com
openai.com.cdn.cloudflare.net
full:openaiapi-site.azureedge.net
full:openaicom-api-bdcpf8c6d2e9atf6.z01.azurefd.net
full:openaicom.imgix.net
full:openaicomproductionae4b.blob.core.windows.net
full:production-openaicom-storage.azureedge.net
regexp:^chatgpt-async-webps-prod-\S+-\d+\.webpubsub\.azure\.com$
# tracking
full:o33249.ingest.sentry.io @ads
full:openaicom.imgix.net @ads
full:browser-intake-datadoghq.com @ads
# Advanced Voice

View File

@@ -1,2 +0,0 @@
cdn.pocoiq.cn @ads
oct.pocoiq.cn @ads

View File

@@ -1 +1,2 @@
sekai.colorfulpalette.org
pjsekai.sega.jp

View File

@@ -2,5 +2,3 @@
pubmatic.com
pubmatic.co.jp
include:pubmatic-ads

View File

@@ -1 +0,0 @@
ads.pubmatic.com @ads

View File

@@ -44,6 +44,7 @@ dnsv1.com.cn
dothework.cn
ectencent.cn
ectencent.com.cn
edgeone.cool
edgeonedy1.com
essurl.com
exmailgz.com

data/radiko: normal file, 5 lines changed

@@ -0,0 +1,5 @@
# radiko official access and streaming domains
radiko-cf.com
radiko.jp
smartstream.ne.jp

View File

@@ -24,6 +24,7 @@ pardot.com
quotable.com
radian6.com
relateiq.com
salesforce-setup.com
salesforce.com
salesforce.org
salesforceiq.com

View File

@@ -8,6 +8,7 @@ galaxyappstore.com
galaxymobile.jp
game-platform.net
knoxemm.com
ospserver.net
samsung.com
samsungads.com @ads
samsungapps.com

View File

@@ -1,4 +1,5 @@
ssacdn.com @ads
supersonic.com @ads
supersonicads.com @ads
ssacdn.com @ads
supersonicads-a.akamaihd.net @ads

View File

@@ -1,2 +0,0 @@
g1.tagtic.cn @ads
xy-log.tagtic.cn @ads

View File

@@ -1 +0,0 @@
ssp.api.tappx.com @ads

View File

@@ -1,2 +1,7 @@
1024terabox.com
bestclouddrive.com
freeterabox.com
nephobox.com
terabox.com
terabox1024.com
teraboxcdn.com

View File

@@ -1 +0,0 @@
uberads.com @ads

View File

@@ -1,4 +1,6 @@
unity.com
unity3d.com
include:unity-ads
# Ads/tracking
iads.unity3d.com @ads
unityads.unity3d.com @ads

View File

@@ -1,6 +1,11 @@
# 优三缔 / 优美缔 / 团结引擎
u3d.cn
unity.cn
unitychina.cn
# Ads/tracking
ads.unitychina.cn @ads
splash-ads.cdn.unity.cn @ads
splash-ads.unitychina.cn @ads
unityads.unity.cn @ads
unityads.unity3d.com @ads
unityads.unitychina.cn @ads

View File

@@ -1,5 +1,6 @@
include:askdiandian
rednotecdn.com
xhscdn.com
xhscdn.net
xhslink.com

internal/dlc/dlc.go: normal file, 9 lines changed

@@ -0,0 +1,9 @@
package dlc
const (
RuleTypeDomain string = "domain"
RuleTypeFullDomain string = "full"
RuleTypeKeyword string = "keyword"
RuleTypeRegexp string = "regexp"
RuleTypeInclude string = "include"
)
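These constants name the prefixes a rule line may carry (`full:`, `regexp:`, and so on); a line without a prefix is treated as a plain `domain` rule. The sketch below shows one way to split a rule token at the first colon only, so that a regexp value containing further colons stays intact. It redeclares the constant locally because an `internal` package cannot be imported from outside the module; `splitRule` is a hypothetical helper, not a function from the repository, and it skips the validation the real parser performs.

```go
package main

import (
	"fmt"
	"strings"
)

// Local copy of the default rule type, mirroring dlc.RuleTypeDomain.
const ruleTypeDomain = "domain"

// splitRule splits a rule token into type and value at the first colon.
// Tokens without a colon default to the "domain" type.
func splitRule(token string) (typ, val string) {
	before, after, found := strings.Cut(token, ":")
	if !found {
		return ruleTypeDomain, token
	}
	return strings.ToLower(before), after
}

func main() {
	tokens := []string{
		"github.io",
		"full:tp.louisvuitton.com",
		"regexp:^a:b$", // made-up pattern containing a second colon
	}
	for _, tok := range tokens {
		typ, val := splitRule(tok)
		fmt.Printf("%q -> type=%q value=%q\n", tok, typ, val)
	}
}
```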

main.go: 321 lines changed

@@ -10,6 +10,7 @@ import (
"slices"
"strings"
"github.com/v2fly/domain-list-community/internal/dlc"
router "github.com/v2fly/v2ray-core/v5/app/router/routercommon"
"google.golang.org/protobuf/proto"
)
@@ -21,23 +22,7 @@ var (
exportLists = flag.String("exportlists", "", "Lists to be flattened and exported in plaintext format, separated by ',' comma")
)
const (
RuleTypeDomain string = "domain"
RuleTypeFullDomain string = "full"
RuleTypeKeyword string = "keyword"
RuleTypeRegexp string = "regexp"
RuleTypeInclude string = "include"
)
var (
TypeChecker = regexp.MustCompile(`^(domain|full|keyword|regexp|include)$`)
ValueChecker = regexp.MustCompile(`^[a-z0-9!\.-]+$`)
AttrChecker = regexp.MustCompile(`^[a-z0-9!-]+$`)
SiteChecker = regexp.MustCompile(`^[A-Z0-9!-]+$`)
)
var (
refMap = make(map[string][]*Entry)
plMap = make(map[string]*ParsedList)
finalMap = make(map[string][]*Entry)
cirIncMap = make(map[string]bool) // Used for circular inclusion detection
@@ -66,7 +51,7 @@ type ParsedList struct {
func makeProtoList(listName string, entries []*Entry) (*router.GeoSite, error) {
site := &router.GeoSite{
CountryCode: listName,
Domain: make([]*router.Domain, 0, len(entries)),
Domain: make([]*router.Domain, 0, len(entries)),
}
for _, entry := range entries {
pdomain := &router.Domain{Value: entry.Value}
@@ -78,13 +63,13 @@ func makeProtoList(listName string, entries []*Entry) (*router.GeoSite, error) {
}
switch entry.Type {
case RuleTypeDomain:
case dlc.RuleTypeDomain:
pdomain.Type = router.Domain_RootDomain
case RuleTypeRegexp:
case dlc.RuleTypeRegexp:
pdomain.Type = router.Domain_Regex
case RuleTypeKeyword:
case dlc.RuleTypeKeyword:
pdomain.Type = router.Domain_Plain
case RuleTypeFullDomain:
case dlc.RuleTypeFullDomain:
pdomain.Type = router.Domain_Full
}
site.Domain = append(site.Domain, pdomain)
@@ -92,18 +77,14 @@ func makeProtoList(listName string, entries []*Entry) (*router.GeoSite, error) {
return site, nil
}
func writePlainList(exportedName string) error {
targetList, exist := finalMap[strings.ToUpper(exportedName)]
if !exist || len(targetList) == 0 {
return fmt.Errorf("'%s' list does not exist or is empty.", exportedName)
}
file, err := os.Create(filepath.Join(*outputDir, strings.ToLower(exportedName) + ".txt"))
func writePlainList(listname string, entries []*Entry) error {
file, err := os.Create(filepath.Join(*outputDir, strings.ToLower(listname)+".txt"))
if err != nil {
return err
}
defer file.Close()
w := bufio.NewWriter(file)
for _, entry := range targetList {
for _, entry := range entries {
fmt.Fprintln(w, entry.Plain)
}
return w.Flush()
@@ -112,83 +93,134 @@ func writePlainList(exportedName string) error {
func parseEntry(line string) (Entry, error) {
var entry Entry
parts := strings.Fields(line)
if len(parts) == 0 {
return entry, fmt.Errorf("empty line")
}
// Parse type and value
rawTypeVal := parts[0]
kv := strings.Split(rawTypeVal, ":")
if len(kv) == 1 {
entry.Type = RuleTypeDomain // Default type
entry.Value = strings.ToLower(rawTypeVal)
} else if len(kv) == 2 {
entry.Type = strings.ToLower(kv[0])
if entry.Type == RuleTypeRegexp {
entry.Value = kv[1]
} else {
entry.Value = strings.ToLower(kv[1])
v := parts[0]
colonIndex := strings.Index(v, ":")
if colonIndex == -1 {
entry.Type = dlc.RuleTypeDomain // Default type
entry.Value = strings.ToLower(v)
if !validateDomainChars(entry.Value) {
return entry, fmt.Errorf("invalid domain: %q", entry.Value)
}
} else {
return entry, fmt.Errorf("invalid format: %s", line)
}
// Check type and value
if !TypeChecker.MatchString(entry.Type) {
return entry, fmt.Errorf("invalid type: %s", entry.Type)
}
if entry.Type == RuleTypeRegexp {
if _, err := regexp.Compile(entry.Value); err != nil {
return entry, fmt.Errorf("invalid regexp: %s", entry.Value)
typ := strings.ToLower(v[:colonIndex])
val := v[colonIndex+1:]
switch typ {
case dlc.RuleTypeRegexp:
if _, err := regexp.Compile(val); err != nil {
return entry, fmt.Errorf("invalid regexp %q: %w", val, err)
}
entry.Type = dlc.RuleTypeRegexp
entry.Value = val
case dlc.RuleTypeInclude:
entry.Type = dlc.RuleTypeInclude
entry.Value = strings.ToUpper(val)
if !validateSiteName(entry.Value) {
return entry, fmt.Errorf("invalid include list name: %q", entry.Value)
}
case dlc.RuleTypeDomain, dlc.RuleTypeFullDomain, dlc.RuleTypeKeyword:
entry.Type = typ
entry.Value = strings.ToLower(val)
if !validateDomainChars(entry.Value) {
return entry, fmt.Errorf("invalid domain: %q", entry.Value)
}
default:
return entry, fmt.Errorf("invalid type: %q", typ)
}
} else if !ValueChecker.MatchString(entry.Value) {
return entry, fmt.Errorf("invalid value: %s", entry.Value)
}
// Parse/Check attributes and affiliations
// Parse attributes and affiliations
for _, part := range parts[1:] {
if strings.HasPrefix(part, "@") {
attr := strings.ToLower(part[1:]) // Trim attribute prefix `@` character
if !AttrChecker.MatchString(attr) {
return entry, fmt.Errorf("invalid attribute key: %s", attr)
if !validateAttrChars(attr) {
return entry, fmt.Errorf("invalid attribute: %q", attr)
}
entry.Attrs = append(entry.Attrs, attr)
} else if strings.HasPrefix(part, "&") {
aff := strings.ToUpper(part[1:]) // Trim affiliation prefix `&` character
if !SiteChecker.MatchString(aff) {
return entry, fmt.Errorf("invalid affiliation key: %s", aff)
if !validateSiteName(aff) {
return entry, fmt.Errorf("invalid affiliation: %q", aff)
}
entry.Affs = append(entry.Affs, aff)
} else {
return entry, fmt.Errorf("invalid attribute/affiliation: %s", part)
return entry, fmt.Errorf("invalid attribute/affiliation: %q", part)
}
}
// Sort attributes
slices.Sort(entry.Attrs)
// Formatted plain entry: type:domain.tld:@attr1,@attr2
entry.Plain = entry.Type + ":" + entry.Value
if len(entry.Attrs) != 0 {
entry.Plain = entry.Plain + ":@" + strings.Join(entry.Attrs, ",@")
var plain strings.Builder
plain.Grow(len(entry.Type) + len(entry.Value) + 10)
plain.WriteString(entry.Type)
plain.WriteByte(':')
plain.WriteString(entry.Value)
for i, attr := range entry.Attrs {
if i == 0 {
plain.WriteByte(':')
} else {
plain.WriteByte(',')
}
plain.WriteByte('@')
plain.WriteString(attr)
}
entry.Plain = plain.String()
return entry, nil
}
func loadData(path string) error {
func validateDomainChars(domain string) bool {
for i := range domain {
c := domain[i]
if (c >= 'a' && c <= 'z') || (c >= '0' && c <= '9') || c == '.' || c == '-' {
continue
}
return false
}
return true
}
func validateAttrChars(attr string) bool {
for i := range attr {
c := attr[i]
if (c >= 'a' && c <= 'z') || (c >= '0' && c <= '9') || c == '!' || c == '-' {
continue
}
return false
}
return true
}
func validateSiteName(name string) bool {
for i := range name {
c := name[i]
if (c >= 'A' && c <= 'Z') || (c >= '0' && c <= '9') || c == '!' || c == '-' {
continue
}
return false
}
return true
}
func loadData(path string) ([]*Entry, error) {
file, err := os.Open(path)
if err != nil {
return err
return nil, err
}
defer file.Close()
listName := strings.ToUpper(filepath.Base(path))
if !SiteChecker.MatchString(listName) {
return fmt.Errorf("invalid list name: %s", listName)
}
var entries []*Entry
scanner := bufio.NewScanner(file)
lineIdx := 0
for scanner.Scan() {
line := scanner.Text()
lineIdx++
// Remove comments
if idx := strings.Index(line, "#"); idx != -1 {
line = line[:idx]
line = line[:idx] // Remove comments
}
line = strings.TrimSpace(line)
if line == "" {
@@ -196,11 +228,11 @@ func loadData(path string) error {
}
entry, err := parseEntry(line)
if err != nil {
return fmt.Errorf("error in %s at line %d: %v", path, lineIdx, err)
return entries, fmt.Errorf("error in %q at line %d: %w", path, lineIdx, err)
}
refMap[listName] = append(refMap[listName], &entry)
entries = append(entries, &entry)
}
return nil
return entries, nil
}
func parseList(refName string, refList []*Entry) error {
@@ -210,11 +242,11 @@ func parseList(refName string, refList []*Entry) error {
plMap[refName] = pl
}
for _, entry := range refList {
if entry.Type == RuleTypeInclude {
if entry.Type == dlc.RuleTypeInclude {
if len(entry.Affs) != 0 {
return fmt.Errorf("affiliation is not allowed for include:%s", entry.Value)
return fmt.Errorf("affiliation is not allowed for include:%q", entry.Value)
}
inc := &Inclusion{Source: strings.ToUpper(entry.Value)}
inc := &Inclusion{Source: entry.Value}
for _, attr := range entry.Attrs {
if strings.HasPrefix(attr, "-") {
inc.BanAttrs = append(inc.BanAttrs, attr[1:]) // Trim attribute prefix `-` character
@@ -238,24 +270,44 @@ func parseList(refName string, refList []*Entry) error {
return nil
}
func polishList(roughMap *map[string]*Entry) []*Entry {
finalList := make([]*Entry, 0, len(*roughMap))
queuingList := make([]*Entry, 0, len(*roughMap)) // Domain/full entries without attr
func isMatchAttrFilters(entry *Entry, incFilter *Inclusion) bool {
if len(incFilter.MustAttrs) == 0 && len(incFilter.BanAttrs) == 0 {
return true
}
if len(entry.Attrs) == 0 {
return len(incFilter.MustAttrs) == 0
}
for _, m := range incFilter.MustAttrs {
if !slices.Contains(entry.Attrs, m) {
return false
}
}
for _, b := range incFilter.BanAttrs {
if slices.Contains(entry.Attrs, b) {
return false
}
}
return true
}
func polishList(roughMap map[string]*Entry) []*Entry {
finalList := make([]*Entry, 0, len(roughMap))
queuingList := make([]*Entry, 0, len(roughMap)) // Domain/full entries without attr
domainsMap := make(map[string]bool)
for _, entry := range *roughMap {
for _, entry := range roughMap {
switch entry.Type { // Bypass regexp, keyword and "full/domain with attr"
case RuleTypeRegexp:
case dlc.RuleTypeRegexp:
finalList = append(finalList, entry)
case RuleTypeKeyword:
case dlc.RuleTypeKeyword:
finalList = append(finalList, entry)
case RuleTypeDomain:
case dlc.RuleTypeDomain:
domainsMap[entry.Value] = true
if len(entry.Attrs) != 0 {
finalList = append(finalList, entry)
} else {
queuingList = append(queuingList, entry)
}
case RuleTypeFullDomain:
case dlc.RuleTypeFullDomain:
if len(entry.Attrs) != 0 {
finalList = append(finalList, entry)
} else {
@@ -266,12 +318,16 @@ func polishList(roughMap *map[string]*Entry) []*Entry {
// Remove redundant subdomains for full/domain without attr
for _, qentry := range queuingList {
isRedundant := false
pd := qentry.Value // Parent domain
pd := qentry.Value // To be parent domain
if qentry.Type == dlc.RuleTypeFullDomain {
pd = "." + pd // So that `domain:example.org` overrides `full:example.org`
}
for {
idx := strings.Index(pd, ".")
if idx == -1 { break }
if idx == -1 {
break
}
pd = pd[idx+1:] // Go for next parent
if !strings.Contains(pd, ".") { break } // Not allow tld to be a parent
if domainsMap[pd] {
isRedundant = true
break
@@ -289,27 +345,16 @@ func polishList(roughMap *map[string]*Entry) []*Entry {
}
func resolveList(pl *ParsedList) error {
if _, pldone := finalMap[pl.Name]; pldone { return nil }
if _, pldone := finalMap[pl.Name]; pldone {
return nil
}
if cirIncMap[pl.Name] {
return fmt.Errorf("circular inclusion in: %s", pl.Name)
return fmt.Errorf("circular inclusion in: %q", pl.Name)
}
cirIncMap[pl.Name] = true
defer delete(cirIncMap, pl.Name)
isMatchAttrFilters := func(entry *Entry, incFilter *Inclusion) bool {
if len(incFilter.MustAttrs) == 0 && len(incFilter.BanAttrs) == 0 { return true }
if len(entry.Attrs) == 0 { return len(incFilter.MustAttrs) == 0 }
for _, m := range incFilter.MustAttrs {
if !slices.Contains(entry.Attrs, m) { return false }
}
for _, b := range incFilter.BanAttrs {
if slices.Contains(entry.Attrs, b) { return false }
}
return true
}
roughMap := make(map[string]*Entry) // Avoid basic duplicates
for _, dentry := range pl.Entries { // Add direct entries
roughMap[dentry.Plain] = dentry
@@ -317,7 +362,7 @@ func resolveList(pl *ParsedList) error {
for _, inc := range pl.Inclusions {
incPl, exist := plMap[inc.Source]
if !exist {
return fmt.Errorf("list '%s' includes a non-existent list: '%s'", pl.Name, inc.Source)
return fmt.Errorf("list %q includes a non-existent list: %q", pl.Name, inc.Source)
}
if err := resolveList(incPl); err != nil {
return err
@@ -328,67 +373,66 @@ func resolveList(pl *ParsedList) error {
}
}
}
finalMap[pl.Name] = polishList(&roughMap)
finalMap[pl.Name] = polishList(roughMap)
return nil
}
func main() {
flag.Parse()
func run() error {
dir := *dataPath
fmt.Println("Use domain lists in", dir)
fmt.Printf("using domain lists data in %q\n", dir)
// Generate refMap
err := filepath.Walk(dir, func(path string, info os.FileInfo, err error) error {
refMap := make(map[string][]*Entry)
err := filepath.WalkDir(dir, func(path string, d os.DirEntry, err error) error {
if err != nil {
return err
}
if info.IsDir() {
if d.IsDir() {
return nil
}
if err := loadData(path); err != nil {
return err
listName := strings.ToUpper(filepath.Base(path))
if !validateSiteName(listName) {
return fmt.Errorf("invalid list name: %q", listName)
}
return nil
refMap[listName], err = loadData(path)
return err
})
if err != nil {
fmt.Println("Failed to loadData:", err)
os.Exit(1)
return fmt.Errorf("failed to loadData: %w", err)
}
// Generate plMap
for refName, refList := range refMap {
if err := parseList(refName, refList); err != nil {
fmt.Println("Failed to parseList:", err)
os.Exit(1)
return fmt.Errorf("failed to parseList %q: %w", refName, err)
}
}
// Generate finalMap
for _, pl := range plMap {
for plname, pl := range plMap {
if err := resolveList(pl); err != nil {
fmt.Println("Failed to resolveList:", err)
os.Exit(1)
return fmt.Errorf("failed to resolveList %q: %w", plname, err)
}
}
// Create output directory if not exist
if _, err := os.Stat(*outputDir); os.IsNotExist(err) {
if mkErr := os.MkdirAll(*outputDir, 0755); mkErr != nil {
fmt.Println("Failed:", mkErr)
os.Exit(1)
}
// Make sure output directory exists
if err := os.MkdirAll(*outputDir, 0755); err != nil {
return fmt.Errorf("failed to create output directory: %w", err)
}
// Export plaintext list
if *exportLists != "" {
exportedListSlice := strings.Split(*exportLists, ",")
for _, exportedList := range exportedListSlice {
if err := writePlainList(exportedList); err != nil {
fmt.Println("Failed to write list:", err)
for rawEpList := range strings.SplitSeq(*exportLists, ",") {
if epList := strings.TrimSpace(rawEpList); epList != "" {
entries, exist := finalMap[strings.ToUpper(epList)]
if !exist || len(entries) == 0 {
fmt.Printf("list %q does not exist or is empty\n", epList)
continue
}
fmt.Printf("list: '%s' has been generated successfully.\n", exportedList)
if err := writePlainList(epList, entries); err != nil {
fmt.Printf("failed to write list %q: %v\n", epList, err)
continue
}
fmt.Printf("list %q has been generated successfully.\n", epList)
}
}
@@ -397,8 +441,7 @@ func main() {
for siteName, siteEntries := range finalMap {
site, err := makeProtoList(siteName, siteEntries)
if err != nil {
fmt.Println("Failed:", err)
os.Exit(1)
return fmt.Errorf("failed to makeProtoList %q: %w", siteName, err)
}
protoList.Entry = append(protoList.Entry, site)
}
@@ -409,13 +452,19 @@ func main() {
protoBytes, err := proto.Marshal(protoList)
if err != nil {
fmt.Println("Failed to marshal:", err)
os.Exit(1)
return fmt.Errorf("failed to marshal: %w", err)
}
if err := os.WriteFile(filepath.Join(*outputDir, *outputName), protoBytes, 0644); err != nil {
fmt.Println("Failed to write output:", err)
return fmt.Errorf("failed to write output: %w", err)
}
fmt.Printf("%q has been generated successfully.\n", *outputName)
return nil
}
func main() {
flag.Parse()
if err := run(); err != nil {
fmt.Printf("Fatal error: %v\n", err)
os.Exit(1)
} else {
fmt.Println(*outputName, "has been generated successfully.")
}
}
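The redundancy check in `polishList` is easier to follow in isolation. The sketch below is a simplified reading of the new behavior, under two assumptions taken from the diff: a `full` rule is prefixed with `.` so that a `domain` rule for the same name overrides it, and a TLD is now allowed to act as a parent. `isRedundant` is a hypothetical helper name, and the `domains` map stands in for `domainsMap`.

```go
package main

import (
	"fmt"
	"strings"
)

// isRedundant reports whether a rule without attributes is already covered
// by a "domain" rule recorded in domains. It walks up the parent domains of
// value one label at a time.
func isRedundant(value string, isFull bool, domains map[string]bool) bool {
	pd := value
	if isFull {
		// The leading dot makes the first parent equal to the name itself,
		// so domain:example.org overrides full:example.org.
		pd = "." + pd
	}
	for {
		idx := strings.Index(pd, ".")
		if idx == -1 {
			return false // no more parents to try
		}
		pd = pd[idx+1:] // move to the next parent
		if domains[pd] {
			return true
		}
	}
}

func main() {
	domains := map[string]bool{"example.org": true}
	fmt.Println(isRedundant("a.example.org", false, domains)) // subdomain of a listed domain
	fmt.Println(isRedundant("example.org", true, domains))    // full rule with the same name
	fmt.Println(isRedundant("example.org", false, domains))   // the domain rule itself
}
```

A `domain` rule never marks itself as redundant here, because the walk starts at its first parent rather than at the name itself.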