INDEX
Explanations
ideas or opinions expressed by individuals
phrases expressing skepticism or criticism
Negative Logits
Token        Logit
byss         -0.73
ãĥŁ          -0.68
Escape       -0.62
ouses        -0.57
indoors      -0.56
nearby       -0.55
ãĤ¨ãĥ«       -0.55
Kinnikuman   -0.55
Adren        -0.55
irtual       -0.55
Positive Logits
Token        Logit
[            0.99
miscon       0.93
rhetorical   0.85
..."         0.84
â̦"          0.83
Senator      0.83
rhetoric     0.83
polit        0.81
disingen     0.81
bipartisan   0.79
Activations Density 1.016%
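
Some tokens in the logit lists above (for example ãĥŁ and ãĤ¨ãĥ«) look like the byte-level display form used by GPT-2 style BPE tokenizers, where each raw byte is shown as one printable character. That is an assumption, since the page does not name the tokenizer. Under it, a minimal sketch of how such strings could be mapped back to readable text follows; the helper names gpt2_byte_decoder and decode_token are hypothetical, not part of any library.

```python
def gpt2_byte_decoder():
    """Inverse of the GPT-2 style byte-to-character display table (assumed)."""
    # Bytes that the display table shows as themselves.
    keep = (
        list(range(0x21, 0x7F))
        + list(range(0xA1, 0xAD))
        + list(range(0xAE, 0x100))
    )
    table = {chr(b): b for b in keep}
    # Every remaining byte is shown as code point 256, 257, ... in byte order.
    offset = 0
    for b in range(256):
        if b not in keep:
            table[chr(256 + offset)] = b
            offset += 1
    return table


def decode_token(display):
    """Turn a displayed token string back into text; return it unchanged on failure."""
    table = gpt2_byte_decoder()
    try:
        raw = bytes(table[ch] for ch in display)
    except KeyError:
        # The string contains a character outside the display table.
        return display
    return raw.decode("utf-8", errors="replace")


# Escapes are used so the exact code points are unambiguous.
print(decode_token("\u00e3\u0125\u0141"))              # ミ
print(decode_token("\u00e3\u0124\u00a8\u00e3\u0125\u00ab"))  # エル
print(decode_token("byss"))                            # byss
```

Plain ASCII tokens pass through unchanged. A string that was damaged after export (one containing characters outside the display table) is returned as-is rather than guessed at.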