INDEX
Explanations
phrases related to political and social issues
Negative Logits
gian     -0.73
wine     -0.64
edly     -0.63
raid     -0.61
iced     -0.60
#$       -0.60
akia     -0.59
anza     -0.58
SHIP     -0.57
cloth    -0.55
Positive Logits
regards     1.21
regard      1.02
terms       0.94
spite       0.86
nutshell    0.83
lieu        0.81
relation    0.80
perpet      0.80
clus        0.80
teasp       0.79
Activations Density: 0.462%