Commit 96d2fce

0.7.26 - improved wordlists.
1 parent cf67ec4 commit 96d2fce

9 files changed, +112 -7 lines changed

Cargo.toml

+1 -1
@@ -1,7 +1,7 @@
 [package]
 name = "rustrict"
 authors = ["Finn Bear"]
-version = "0.7.25"
+version = "0.7.26"
 edition = "2021"
 license = "MIT OR Apache-2.0"
 repository = "https://github.com/finnbear/rustrict/"

README.md

+2 -3
@@ -132,8 +132,7 @@ If you want to add custom profanities or safe words, enable the `customize` feat
 }
 ```
 
-But wait, there's more! If your use-case is chat moderation, and you can store data on a per-user basis, you
-might benefit from the `context` feature.
+If your use-case is chat moderation, and you store data on a per-user basis, you can use `rustrict::Context` as a reference implementation:
 
 ```rust
 #[cfg(feature = "context")]
@@ -178,7 +177,7 @@ is used as a dataset. Positive accuracy is the percentage of profanity detected
 
 | Crate | Accuracy | Positive Accuracy | Negative Accuracy | Time |
 |-------|----------|-------------------|-------------------|------|
-| [rustrict](https://crates.io/crates/rustrict) | 79.74% | 94.00% | 76.18% | 9s |
+| [rustrict](https://crates.io/crates/rustrict) | 79.74% | 94.00% | 76.19% | 9s |
 | [censor](https://crates.io/crates/censor) | 76.16% | 72.76% | 77.01% | 23s |
 
 ## Development
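
Note on the accuracy columns above: the README hunk says positive accuracy is the percentage of profanity detected. The sketch below shows how the three columns could be computed from confusion counts. The counts are hypothetical (not the benchmark's), and the definition of negative accuracy as the share of clean text left unflagged is an assumption, since the hunk header is truncated.

```rust
/// Hypothetical confusion counts for a labeled dataset (not the benchmark's numbers).
struct Counts {
    true_positives: u32,  // profane text that was flagged
    false_negatives: u32, // profane text that was missed
    true_negatives: u32,  // clean text that was left alone
    false_positives: u32, // clean text that was flagged
}

impl Counts {
    /// Percentage of profane text that was detected.
    fn positive_accuracy(&self) -> f64 {
        100.0 * self.true_positives as f64
            / (self.true_positives + self.false_negatives) as f64
    }

    /// Percentage of clean text that was not flagged (assumed definition).
    fn negative_accuracy(&self) -> f64 {
        100.0 * self.true_negatives as f64
            / (self.true_negatives + self.false_positives) as f64
    }

    /// Percentage of all samples classified correctly.
    fn accuracy(&self) -> f64 {
        let correct = self.true_positives + self.true_negatives;
        let total = correct + self.false_negatives + self.false_positives;
        100.0 * correct as f64 / total as f64
    }
}

fn main() {
    let counts = Counts {
        true_positives: 94,
        false_negatives: 6,
        true_negatives: 300,
        false_positives: 100,
    };
    println!(
        "accuracy {:.2}% | positive {:.2}% | negative {:.2}%",
        counts.accuracy(),
        counts.positive_accuracy(),
        counts.negative_accuracy()
    );
}
```

Because overall accuracy is weighted by class sizes, a small change in negative accuracy (as in this commit's table edit) can leave the rounded overall figure unchanged.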

src/character_analyzer.rs

+1
@@ -49,6 +49,7 @@ fn main() {
 '🐿' => 20,
 '𒐫' => 40,
 '𒈙' => 35,
+'༺' | '༻' => 25,
 _ => {
     let max_width = (max_width(c, &fonts) as f32 / 100f32).round() as u16;
     if max_width > u8::MAX as u16 {
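
Note on the hunk above: it adds a hand-picked width for two characters ahead of the fallback arm, which measures, scales by 100, rounds, and checks against `u8::MAX`. A minimal sketch of that override-then-measure pattern follows. `measured_width` and its raw values are hypothetical stand-ins for the font measurement, and saturating at `u8::MAX` is an assumption, because the body of the `if` is not visible in the diff.

```rust
/// Hypothetical stand-in for the font measurement; returns a raw width
/// in font units. The values are made up for this sketch.
fn measured_width(c: char) -> u32 {
    match c {
        'i' => 280,
        'W' => 940,
        '﷽' => 30_000, // an unusually wide glyph; hypothetical raw value
        _ => 600,
    }
}

/// Manual overrides win; everything else is measured, scaled, and made to fit a u8.
fn width_of(c: char) -> u8 {
    match c {
        // Override taken from the diff above.
        '༺' | '༻' => 25,
        _ => {
            let scaled = (measured_width(c) as f32 / 100f32).round() as u16;
            // Assumption: saturate instead of wrapping when the value exceeds u8::MAX.
            scaled.min(u8::MAX as u16) as u8
        }
    }
}

fn main() {
    for c in ['i', 'W', '༺', '﷽'] {
        println!("{:?} -> {}", c, width_of(c));
    }
}
```

Placing the override arms before the wildcard arm matters: `match` takes the first arm that matches, so a measured value never replaces a manual one.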

src/character_widths.bin

8 Bytes
Binary file not shown.

src/context.rs

+4
@@ -8,6 +8,10 @@ use std::time::{Duration, Instant};
 
 /// Context is useful for taking moderation actions on a per-user basis i.e. each user would get
 /// their own Context.
+///
+/// # Recommendation
+///
+/// Use this as a reference implementation e.g. by copying and adapting it.
 #[derive(Clone)]
 #[cfg_attr(feature = "serde", derive(serde::Serialize, serde::Deserialize))]
 #[cfg_attr(doc, doc(cfg(feature = "context")))]
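
Note on the new doc comment: it recommends copying and adapting `Context` rather than depending on it as-is. The sketch below illustrates what a small adapted per-user context might look like. It uses only the standard library, and the type, its fields, and its two rules (`UserContext`, `Blocked`, a minimum interval, a repeat check) are hypothetical and not taken from `rustrict::Context`.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Hypothetical per-user moderation state; not the fields of `rustrict::Context`.
#[derive(Clone)]
struct UserContext {
    last_message: Option<(Instant, String)>,
    min_interval: Duration,
}

/// Hypothetical reasons for rejecting a message.
#[derive(Debug, PartialEq)]
enum Blocked {
    TooFast,
    Repetitive,
}

impl UserContext {
    fn new(min_interval: Duration) -> Self {
        Self { last_message: None, min_interval }
    }

    /// Accepts or rejects a message. `now` is a parameter so the logic can be tested
    /// without sleeping. Rejected messages do not update the stored state.
    fn process(&mut self, message: &str, now: Instant) -> Result<(), Blocked> {
        if let Some((at, previous)) = &self.last_message {
            if now.duration_since(*at) < self.min_interval {
                return Err(Blocked::TooFast);
            }
            if previous.as_str() == message {
                return Err(Blocked::Repetitive);
            }
        }
        self.last_message = Some((now, message.to_owned()));
        Ok(())
    }
}

fn main() {
    // Each user gets their own context, keyed here by a hypothetical numeric id.
    let mut contexts: HashMap<u64, UserContext> = HashMap::new();
    let start = Instant::now();
    let ctx = contexts
        .entry(1)
        .or_insert_with(|| UserContext::new(Duration::from_secs(2)));
    println!("{:?}", ctx.process("hello", start));
    println!("{:?}", ctx.process("hello again", start + Duration::from_secs(1)));
}
```

Keeping one small struct per user is what makes the state easy to store alongside other user data.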

src/dictionary_extra.txt

+11
@@ -1,16 +1,21 @@
 #8
 # of
+(until
 2 secs
 3 secs
 4 secs
+45s
 5 secs
 6 secs
 7 secs
 8 secs
 88
 9 secs
+9 is still
 99
 0 secs
+300 bot
+600 bot
 twinkie
 two secs
 three secs
@@ -22,6 +27,7 @@ eight secs
 nine secs
 ten secs
 aboutit
+admit it's
 ain't it
 alt
 an ai
@@ -78,6 +84,7 @@ few secs
 ffa game
 fire cracker
 fire crackers
+forgot it's
 francoitalian
 franco italian
 freakin
@@ -101,6 +108,7 @@ hellen
 hellp
 h on keyboard
 h tier
+hi @Bla
 hi tirp
 ho ho ho
 honkeytonk
@@ -184,6 +192,7 @@ pp. 9
 pussinboots
 puss in boots
 ref'd
+refresh at
 rip
 saturated fat
 shoehorn your
@@ -197,6 +206,7 @@ suicide squad
 superbowlxxx
 tally ho
 tally-ho
+tea the
 test test test
 then i guess
 then talk
@@ -229,6 +239,7 @@ virgin islands
 wassup
 wasn't it
 wouldn't it
+xp or no
 yass
 yesturday
 zenga

src/false_positives.txt

+24
@@ -1,13 +1,18 @@
 # of
 #8
+(until
 0 secs
 2 secs
 3 secs
+300 bot
 4 secs
+45s
 5 secs
 6 secs
+600 bot
 7 secs
 8 secs
+9 is still
 9 secs
 a analog
 a analyse
@@ -147,6 +152,7 @@ adipex nissan
 adipex pee
 adipex rated
 adiposogenital
+admit it's
 ado lif
 adramelech
 adrammelech
@@ -2749,6 +2755,14 @@ bol lock
 bol locks
 bol look
 bol looks
+bomb china
+bomb india
+bomb iran
+bomb israel
+bomb palestine
+bomb russia
+bomb ukraine
+bomb usage
 bon ed
 bon eric
 bon erik
@@ -6863,6 +6877,7 @@ fore skin
 forebreast
 forget lost
 forget married
+forgot it's
 fork cocktail
 fork commission
 fork cook
@@ -7800,6 +7815,7 @@ heterosex
 heterotic
 hexadic
 hexanal
+hi @Bla
 hi little
 hi tier
 hi tile
@@ -7822,6 +7838,7 @@ highs perm
 highs seeks
 hilar
 hildebrandic
+hill hitting
 hill illus
 hill iv
 hill ju
@@ -9654,6 +9671,7 @@ junk until
 junk untitled
 junk unto
 jurisprude
+just cumulative
 justments cumulative
 justments ext
 justments hilt
@@ -9998,6 +10016,7 @@ kill twelve
 kill twenty
 kill twi
 kill ty
+killed yourself
 killian
 killing jewel
 killing palestinian
@@ -10351,6 +10370,7 @@ less blin
 less bo
 lets cumulative
 lets ext
+lets fake
 lets hilt
 lets hit
 lets lut
@@ -13515,6 +13535,7 @@ plumbaginaceous
 plumbum
 plumigerous
 plzz
+pmsg
 pn lips
 pn nigeria
 pnigerophobia
@@ -15134,6 +15155,7 @@ res perm
 res seeks
 resex
 resh aging
+resh at
 resh hilt
 resh hit
 resh it
@@ -17647,6 +17669,7 @@ tch linking
 tch links
 tch little
 tchincou
+tea the
 teanal
 teapottykin
 teataster
@@ -19880,6 +19903,7 @@ xnxx until
 xnxx untitled
 xnxx unto
 xnxx vie
+xp or no
 ya holes
 yacht its
 yacht texts
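
Note on the wordlist additions: many entries are harmless phrases that only look suspicious once spaces are ignored. The sketch below illustrates the general idea of a false-positive list with a deliberately naive matcher. The banned word (`darn`), the allow-listed phrase (`radar name`), and the exact-match lookup are hypothetical choices for illustration. How the crate actually consumes `false_positives.txt` is not shown in this diff.

```rust
use std::collections::HashSet;

/// Hypothetical banned word, chosen only for illustration.
const BANNED: &[&str] = &["darn"];

/// Returns true if `text` should be flagged.
/// Assumption: the allow-list is matched against the whole lowercased input.
fn is_flagged(text: &str, false_positives: &HashSet<&str>) -> bool {
    let lowered = text.to_lowercase();
    // A known harmless phrase is never flagged.
    if false_positives.contains(lowered.as_str()) {
        return false;
    }
    // Naive matcher: drop whitespace so a word spread across spaces is still caught.
    let squashed: String = lowered.chars().filter(|c| !c.is_whitespace()).collect();
    BANNED.iter().any(|word| squashed.contains(*word))
}

fn main() {
    let false_positives: HashSet<&str> = ["radar name"].iter().copied().collect();
    // "radar name" squashes to "radarname", which contains "darn".
    for text in ["d a r n", "radar name", "hello"] {
        println!("{:?} flagged: {}", text, is_flagged(text, &false_positives));
    }
}
```

Without the allow-list entry, the space-insensitive matcher would flag `radar name`, which is the kind of collision a false-positive list exists to suppress.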
