INDEX
Explanations
names of specific individuals, especially professionals or experts in their fields
statements made by authorities or experts
Negative Logits
clubhouse    -0.76
tumblr       -0.73
canoe        -0.69
manifesto    -0.60
tune         -0.59
robbers      -0.58
conquering   -0.57
yacht        -0.57
partying     -0.57
deduction    -0.57
Positive Logits
ovich        0.97
ansky        0.88
inski        0.87
(@           0.85
mann         0.84
auer         0.84
enberg       0.79
uria         0.79
gaard        0.78
inger        0.78
Activations Density: 0.562%