INDEX
Explanations
items of critique or controversy
Negative Logits
worldly    -0.72
,          -0.68
ome        -0.66
":"/       -0.66
ally       -0.65
agine      -0.62
sth        -0.61
entimes    -0.60
ocry       -0.58
azing      -0.57
Positive Logits
etc        0.93
LLC        0.90
huh        0.86
aka        0.85
Inc        0.81
supra      0.80
ISBN       0.76
000        0.75
Calif      0.73
which      0.73
Activations Density 2.777%