INDEX
Explanations
mentions of the term "casual" in various contexts
Negative Logits
elts     -0.16
blr      -0.16
/goto    -0.15
olar     -0.15
ater     -0.15
aska     -0.14
ollen    -0.14
awai     -0.14
syn      -0.14
aters    -0.14
Positive Logits
urus         0.15
nes          0.14
OOD          0.14
beit         0.14
therapy      0.14
é³           0.14
ãĤŃãĥ³ãĤ°    0.14
Becker       0.14
ariat        0.14
affair       0.14
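Two of the positive-logit tokens (é³ and ãĤŃãĥ³ãĤ°) look like raw byte-level BPE vocabulary strings rather than readable text. The sketch below shows one way to turn such strings back into text. It assumes the underlying tokenizer uses the GPT-2 style byte-to-unicode table, which the page itself does not state. The helper name `decode_token` is hypothetical and is not part of any dashboard API.

```python
def bytes_to_unicode() -> dict[int, str]:
    """Byte -> printable character table used by GPT-2 style byte-level BPE."""
    # Bytes that are already printable keep their own code point.
    bs = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(ord("¡"), ord("¬") + 1))
        + list(range(ord("®"), ord("ÿ") + 1))
    )
    cs = bs[:]
    n = 0
    # Every remaining byte is shifted to a code point at or above 256.
    for b in range(256):
        if b not in bs:
            bs.append(b)
            cs.append(256 + n)
            n += 1
    return dict(zip(bs, (chr(c) for c in cs)))


# Inverse table: vocabulary character -> original byte value.
_UNICODE_TO_BYTE = {ch: b for b, ch in bytes_to_unicode().items()}


def decode_token(token: str) -> str:
    """Hypothetical helper: map a vocabulary string back to UTF-8 text.

    Assumes every character of `token` comes from the table above.
    Incomplete UTF-8 sequences are rendered as U+FFFD rather than raising.
    """
    raw = bytes(_UNICODE_TO_BYTE[ch] for ch in token)
    return raw.decode("utf-8", errors="replace")


print(decode_token("ãĤŃãĥ³ãĤ°"))  # キング
```

Under the same assumption, é³ maps to the two bytes E9 B3, which is only part of a three-byte UTF-8 character, so it cannot be rendered on its own and decodes to a replacement character.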
Activations Density 0.009%