INDEX
Explanations
names and specific terms related to political and social figures
references to specific individuals or entities in various contexts
Negative Logits
;;          -0.77
igate       -0.67
;}          -0.64
";          -0.62
ORK         -0.62
};          -0.61
.;          -0.61
Accessed    -0.61
hart        -0.60
estern      -0.60
Positive Logits
nonetheless     1.78
nevertheless    1.60
hasn            1.23
persists        1.20
etheless        1.17
remains         1.16
insists         1.16
still           1.15
doesn           1.14
remained        1.14
Activations Density: 0.694%