INDEX
Explanations
references to the concept of reality
mentions of the word "reality."
Negative Logits
indal       -0.81
asus        -0.81
asso        -0.80
edo         -0.79
artney      -0.79
incinn      -0.78
edin        -0.76
cair        -0.73
raph        -0.71
ergy        -0.71
Positive Logits
psons       0.94
istically   0.93
ignment     0.86
reality     0.80
reality     0.78
srfAttach   0.75
distortion  0.73
fulness     0.73
ually       0.73
Ens         0.72
Activations Density 0.032%