Explanations
contractions of 'is' and 'not'
negations or phrases indicating the absence of something
Negative Logits
PU          -0.70
Mechdragon  -0.70
dust        -0.69
mark        -0.66
couch       -0.65
flies       -0.63
retard      -0.62
compliment  -0.62
complete    -0.62
nearest     -0.61
Positive Logits
't          1.57
iting       1.05
ÃŃ          1.04
ited        1.03
itely       1.01
uts         0.96
eness       0.95
cest        0.94
its         0.93
atically    0.92
Activations Density: 0.105%