INDEX
Explanations
direct speech statements made by individuals
quotations and reported speech
Negative Logits
stub        -0.57
harvest     -0.54
woodland    -0.53
fucking     -0.52
tumblr      -0.52
nig         -0.51
pneumonia   -0.51
broom       -0.51
mirac       -0.50
mole        -0.49
Positive Logits
aturday     0.61
uer         0.59
cients      0.59
inguished   0.59
ukong       0.58
enegger     0.57
iHUD        0.56
region      0.56
bnb         0.56
emi         0.56
Activations Density 0.631%