INDEX
Explanations
references to schools and educational settings
Negative Logits
CHAT        -0.65
onite       -0.62
tein        -0.61
odox        -0.61
anan        -0.60
Rebellion   -0.59
ILA         -0.58
ilogy       -0.58
teness      -0.58
theless     -0.57
Positive Logits
chool       1.27
hops        1.16
hips        1.12
paces       1.10
nationwide  1.04
frequ       0.97
afety       0.94
everywhere  0.92
creen       0.91
alike       0.90
Activations Density: 0.199%