Explanations
references to cheering or expressions of support and encouragement
Negative Logits
Token      Logit
ngth       -0.73
amation    -0.66
è£ħ        -0.63
nesota     -0.63
fung       -0.58
ahime      -0.58
consulted  -0.58
Purg       -0.57
uture      -0.57
arin       -0.57

Positive Logits
Token      Logit
leaders    1.47
leader     1.44
leading    1.36
lead       1.19
fulness    1.04
cheer      1.02
fully      0.95
wart       0.89
hovah      0.89
sticks     0.88
Activations Density: 0.005%