A good way to test your intuitions about how statistics works is to run simulations.
The basics
You'll need to generate random numbers: rnorm, rbinom, and rpois will all come in handy.
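For instance (the set.seed call just makes the draws reproducible):

```r
set.seed(1)
rnorm(5, mean = 0, sd = 1)       # 5 draws from a standard normal
rbinom(5, size = 10, prob = .5)  # 5 draws: successes in 10 fair coin flips
rpois(5, lambda = 3)             # 5 draws from a Poisson with mean 3
```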
Some examples
Between-subjects data: overlapping 95% CIs need not imply no significant difference
Why you can't use 95% CIs for within-subject data
Example showing how adding random intercepts can improve fit
Generating the null distribution of Pearson's r for various sample sizes
Simulate a series of correlational studies, each with n participants and no true correlation between x and y; this gives an estimate of the null sampling distribution of r.
studies = 1000
n = 50
r1 = numeric(studies)  # preallocate
for (i in 1:studies) {
  x = rnorm(n, 0, 1)
  y = rnorm(n, 0, 1)
  r1[i] = cor(x, y)
}
n = 200
r2 = numeric(studies)
for (i in 1:studies) {
  x = rnorm(n, 0, 1)
  y = rnorm(n, 0, 1)
  r2[i] = cor(x, y)
}
par(mfrow = c(1,2))
hist(r1, main="n = 50", col = "lightblue",xlim=c(-.7,.7), xlab = "r")
hist(r2, main="n = 200", col = "lightblue",xlim=c(-.7,.7), xlab = "r")
par(mfrow = c(1,1))
Picture below:

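Beyond eyeballing the histograms, the simulated null distribution can be used to read off empirical critical values of r. A minimal sketch (the null_r helper is just for illustration):

```r
# Empirical two-tailed 5% critical values of r under the null, by simulation
null_r <- function(n, studies = 1000) {
  replicate(studies, cor(rnorm(n), rnorm(n)))
}
set.seed(1)
quantile(null_r(50),  c(.025, .975))  # roughly +/- .28 for n = 50
quantile(null_r(200), c(.025, .975))  # roughly +/- .14 for n = 200
```

The critical values shrink as n grows, which is exactly the narrowing visible in the two histograms.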
Generating data with different means and variances
# covariate values
x <- runif(100, -5, 5)
# Example 1
# model for variance
sds <- 0.5 + abs(x) * 2
# model for responses
y <- 2 + 3 * x + rnorm(100, sd = sds)
fm <- lm(y ~ x)
plot(fm, 1)
plot(fm, 3)
# Example 2
# model for variance (smallest at min(x), increasing with x)
sds <- 1 + (abs(min(x)) + x) * 2
# model for responses
y <- 2 + 3 * x + rnorm(100, sd = sds)
fm <- lm(y ~ x)
plot(fm, 1)
plot(fm, 3)
Thank you to Dimitris Rizopoulos
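When the variance model is known, as it is in these simulations, one remedy is weighted least squares with weights proportional to 1/variance. A sketch, regenerating data as in Example 1 (fm.wls is an illustrative name):

```r
set.seed(1)
x <- runif(100, -5, 5)
sds <- 0.5 + abs(x) * 2              # known variance model
y <- 2 + 3 * x + rnorm(100, sd = sds)
fm.ols <- lm(y ~ x)                  # ignores the heteroscedasticity
fm.wls <- lm(y ~ x, weights = 1 / sds^2)
summary(fm.wls)$coef                 # standard errors reflect the variance model
```

In practice the variance model usually has to be estimated rather than plugged in, but the simulation shows the mechanics.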
Simulating mixture distributions
Here we simulate an experiment with two (between-subject) conditions. In both conditions the responses follow a mixture distribution (intuitively there are two groups: very fast responses and slightly slower ones, perhaps produced by different systems); however, in the second condition (represented by x2), the upper distribution is shifted upwards.
x1a = rnorm(2000,100,5)
x1b = rnorm(5000,110,5)
x1 = c(x1a,x1b)
x2a = rnorm(2000,100,5)
x2b = rnorm(5000,115,5)
x2 = c(x2a,x2b)
par(mfrow =c(1,2))
plot(density(x1),main="", xlab = "RT")
lines(density(x2), lty=2)
plot(density(x1b),main="", xlab = "RT")
lines(density(x2b), lty=2)
par(mfrow = c(1,1))
t.test(x1,x2)
t.test(x1b,x2b)
The t-test on the mixture gives a significant result, but note that the difference between means is diluted by the presence of the lower distribution. (Mixture difference: 107 vs. 111; pure effect difference: 110 vs. 115.)
The density plot below shows both the mixture distributions and the pure effect bit of the distribution.

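The dilution of the effect in the mixture can be checked against what the mixing weights predict (a sketch, regenerating the data as above):

```r
set.seed(1)
x1 <- c(rnorm(2000, 100, 5), rnorm(5000, 110, 5))
x2 <- c(rnorm(2000, 100, 5), rnorm(5000, 115, 5))
# Only the upper component (5000 of 7000 observations) carries the 5-unit effect,
# so the expected mixture difference is 5000/7000 * 5, about 3.57
mean(x2) - mean(x1)
5000 / 7000 * 5
```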
Positive correlation isn't transitive
Let U, V, and W be independent random variables. Define X = U+V, Y = V+W, and Z = W-U. Then the correlations between X and Y and between Y and Z are positive, but the correlation between X and Z is negative. (See Langford, E., Schwertman, N., & Owens, M. (2001). Is the property of being positively correlated transitive? The American Statistician, 55, 322-325.)
U = rnorm(100,0,1)
V = rnorm(100,0,1)
W = rnorm(100,0,1)
X = U+V
Y = V+W
Z = W-U
cor.test(X,Y)
cor.test(Y,Z)
cor.test(X,Z)
par(mfrow = c(2,2))
plot(Y~X,xlim=c(-5,5),ylim=c(-5,5))
abline(lm(Y~X),col="red")
plot(Y~Z,xlim=c(-5,5),ylim=c(-5,5))
abline(lm(Y~Z),col="red")
plot(Z~X,xlim=c(-5,5),ylim=c(-5,5))
abline(lm(Z~X),col="red")
par(mfrow = c(1,1))
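The signs follow directly from the covariances: Cov(X,Y) = Var(V), Cov(Y,Z) = Var(W), and Cov(X,Z) = -Var(U), so with standard normals the population covariances are 1, 1, and -1. A large-sample check:

```r
set.seed(1)
n <- 1e5
U <- rnorm(n); V <- rnorm(n); W <- rnorm(n)
X <- U + V; Y <- V + W; Z <- W - U
c(cov(X, Y), cov(Y, Z), cov(X, Z))  # approximately 1, 1, -1
```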

Median splits are a bad idea…
> x = rnorm(100,100,15)
> y = 2*x + 5 + rnorm(100,0,35)
>
> summary(lm(y ~ x))
Estimate Std. Error t value Pr(>|t|)
(Intercept) 22.442 21.503 1.04 0.3
x 1.885 0.212 8.88 3.3e-14 ***
Multiple R-squared: 0.446, Adjusted R-squared: 0.44
F-statistic: 78.8 on 1 and 98 DF, p-value: 3.32e-14
> summary(lm(y ~ cut(x,2)))
Estimate Std. Error t value Pr(>|t|)
(Intercept) 178.45 6.58 27.12 < 2e-16 ***
cut(x, 2)(94.2,135] 49.26 8.10 6.08 2.3e-08 ***
Multiple R-squared: 0.274, Adjusted R-squared: 0.267
F-statistic: 37 on 1 and 98 DF, p-value: 2.30e-08
(Compare the R-squared values, the t's, and the p's: dichotomising throws away information. Note too that cut(x, 2) splits at the midpoint of the range rather than at the median, though for roughly symmetric data the result is similar.)
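The loss isn't a quirk of one sample: repeating the comparison many times shows the dichotomised model reliably explains less variance. A sketch, here using a true median split (r2.loss is an illustrative name):

```r
set.seed(1)
r2.loss <- replicate(1000, {
  x <- rnorm(100, 100, 15)
  y <- 2 * x + 5 + rnorm(100, 0, 35)
  xhigh <- x > median(x)  # a true median split
  summary(lm(y ~ x))$r.squared - summary(lm(y ~ xhigh))$r.squared
})
mean(r2.loss)      # average R-squared lost by dichotomising
mean(r2.loss > 0)  # proportion of runs where the split does worse
```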
Sums of uniformly distributed random variables
See here for the background.
x1 = runif(1000,1,6)
x2 = runif(1000,1,6)
x3 = runif(1000,1,6)
x4 = runif(1000,1,6)
x5 = runif(1000,1,6)
x6 = runif(1000,1,6)
x7 = runif(1000,1,6)
x8 = runif(1000,1,6)
x9 = runif(1000,1,6)
par(mfrow=c(3,3))
hist(x1, main="",xlab="x")
hist(x1+x2, main="",xlab="x")
hist(x1+x2+x3, main="",xlab="x")
hist(x1+x2+x3+x4, main="",xlab="x")
hist(x1+x2+x3+x4+x5, main="",xlab="x")
hist(x1+x2+x3+x4+x5+x6, main="",xlab="x")
hist(x1+x2+x3+x4+x5+x6+x7, main="",xlab="x")
hist(x1+x2+x3+x4+x5+x6+x7+x8, main="",xlab="x")
hist(x1+x2+x3+x4+x5+x6+x7+x8+x9, main="",xlab="x")
par(mfrow=c(1,1))
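The same picture can be built more compactly with rowSums, and the central limit theorem predicts which normal the sum of nine U(1, 6) variables approaches: mean 9 * 3.5 and variance 9 * (6 - 1)^2 / 12. A sketch:

```r
set.seed(1)
k <- 9
draws <- matrix(runif(1000 * k, 1, 6), ncol = k)  # 1000 draws of each of 9 uniforms
sums <- rowSums(draws)
hist(sums, freq = FALSE, main = "", xlab = "sum of 9 uniforms")
# Overlay the normal the CLT predicts
curve(dnorm(x, mean = k * 3.5, sd = sqrt(k * 25 / 12)), add = TRUE, lty = 2)
```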
