INDEX
Explanations
names or terms related to specific individuals
the occurrence of the term "worth" and its variants in various contexts
Negative Logits
deaf          -0.69
subtitle      -0.64
awoken        -0.63
cerebral      -0.62
heter         -0.61
Engineer      -0.60
Nou           -0.59
repressive    -0.59
cynical       -0.59
typo          -0.58
Positive Logits
worth         1.53
sburg         1.13
sworth        0.99
pole          0.96
iness         0.87
bye           0.86
emouth        0.85
ocity         0.84
sburgh        0.84
nton          0.83
Activations Density: 0.006%