INDEX
Explanations
phrases related to reasonable or logical thinking
references to rationality or common sense in discussions about people and their behaviors
Negative Logits
Reloaded        -0.69
Anniversary     -0.66
disappearance   -0.65
Regions         -0.62
Territories     -0.61
Promise         -0.60
apeake          -0.60
Siege           -0.60
rehearsal       -0.59
forestation     -0.59
Positive Logits
inclined        0.85
think           0.83
kie             0.80
ellectual       0.80
thinking        0.79
tarian          0.78
iuses           0.78
ascript         0.77
instinctively   0.77
enance          0.77
Activations Density: 0.377%
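
The page does not say how the density figure is computed. One common reading, assumed here, is that it is the fraction of sampled tokens on which the feature's activation is above zero. The sketch below illustrates that reading only; the function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical and do not come from the source.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" means the share of tokens on which the feature
    fires at all. The source page does not confirm this definition.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 1000 tokens, of which 4 have a nonzero activation.
sample = [0.0] * 996 + [1.2, 0.4, 3.1, 0.05]
density = activation_density(sample)
print(f"{density:.3%}")  # prints 0.400%
```

Under this reading, a density of 0.377% would mean the feature is active on roughly 4 of every 1,000 sampled tokens.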