Explanations
mentions of computing or technology, especially related to bots or digital threats
Negative Logits
Token       Logit
mble        -0.80
mund        -0.73
pact        -0.65
McCartney   -0.62
ity         -0.62
vid         -0.62
egal        -0.61
……………………    -0.61
VEN         -0.60
ities       -0.60
Positive Logits
Token       Logit
anical      1.33
anie        1.02
wana        1.01
anooga      0.97
cham        0.97
zeb         0.94
anic        0.94
ania        0.94
any         0.90
herer       0.89
Activations Density 0.198%