INDEX
Explanations
the word "others" signaling a contrast or difference from a certain group or viewpoint
references to differing opinions or perspectives
Negative Logits
ocracy      -0.66
Accessory   -0.66
Url         -0.64
opoly       -0.64
obar        -0.63
Deal        -0.63
Sac         -0.62
ihara       -0.61
itation     -0.61
ãĥĥ         -0.59
Positive Logits
ngth        0.95
cius        0.94
nces        0.77
yout        0.77
indo        0.74
rack        0.74
ellery      0.73
ecided      0.72
igham       0.72
hots        0.72
Activations Density 0.026%