INDEX
Explanations
the word "monolog" or similar variations with different levels of activation
words related to various fields of study, particularly focusing on aspects of "logic" and "analogies."
Negative Logits
plan      -0.63
noon      -0.62
uncond    -0.60
words     -0.60
EAR       -0.60
eff       -0.59
FACE      -0.58
FUL       -0.58
affe      -0.58
kins      -0.57
Positive Logits
ues       1.78
raphic    1.24
uing      1.20
raph      1.18
rams      1.16
ued       1.16
ous       1.10
etics     1.07
ical      1.06
etic      1.06
Activations Density 0.104%