INDEX
Explanations
references to various forms of communication or expression, such as speeches, confessions, and testimonies
terms related to personal narratives and significant actions or events
Negative Logits
iland      -0.76
psc        -0.72
enegger    -0.72
ERO        -0.67
stad       -0.65
£ı         -0.65
mobi       -0.65
react      -0.64
effect     -0.64
Ĭ±         -0.63
Positive Logits
worn              0.77
holder            0.69
belonging         0.69
bestowed          0.68
among             0.67
Coil              0.64
lessness          0.64
exercised         0.64
uttered           0.63
whistleblowers    0.60
Activations Density 0.563%