Explanations
adjectives and verbs related to an evaluation or opinion of something
assertions about significant truths or principles
Negative Logits
Token             Logit
itures            -0.81
misc              -0.71
ideon             -0.68
aroo              -0.68
rams              -0.67
Delivery          -0.67
ramids            -0.65
ersion            -0.64
rn                -0.62
..............    -0.62
Positive Logits
Token             Logit
bothering         0.85
overlooked        0.83
distinguishing    0.78
noticeable        0.77
highlights        0.76
distinguishes     0.75
borne             0.75
unnoticed         0.75
oken              0.75
bothered          0.73
Activations Density 0.216%