I think this part of the clone(2) man page may clear up the difference regarding the PID:
CLONE_THREAD (since Linux 2.4.0-test8)
If CLONE_THREAD is set, the child is placed in the same thread group as the calling process.
Thread groups were a feature added in Linux 2.4 to support the POSIX threads notion of a set of threads that share a single PID. Internally, this shared PID is the so-called thread group identifier (TGID) for the thread group. Since Linux 2.4, calls to getpid(2) return the TGID of the caller.
The "threads are implemented as processes" phrase may refer to threads having had different PIDs in the past. Linux originally didn't have threads within a process, just separate processes (with separate PIDs) that could share some resources, like virtual memory or file descriptors. CLONE_THREAD and the separation of process ID(*) and thread ID make the Linux behaviour look more like other systems, even though technically the OS still doesn't have separate implementations for threads and processes.
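To see the PID/thread ID split in practice, here is a minimal sketch, assuming Linux and Python 3.8 or later (for threading.get_native_id(), which reports the kernel thread ID). The helper name collect_ids is just something I made up for illustration. It records what os.getpid() and the thread ID look like from the main thread and from a few worker threads:

```python
import os
import threading


def collect_ids(n_threads=3):
    """Return a list of (pid, tid) pairs: main thread first, then workers."""
    results = []
    lock = threading.Lock()
    # The barrier keeps every worker alive until all of them have recorded
    # their IDs, so a finished thread's ID cannot be reused by a later one.
    barrier = threading.Barrier(n_threads)

    def worker():
        with lock:
            results.append((os.getpid(), threading.get_native_id()))
        barrier.wait()

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    # The main thread's own view comes first in the returned list.
    return [(os.getpid(), threading.get_native_id())] + results


if __name__ == "__main__":
    for pid, tid in collect_ids():
        print(f"getpid() = {pid}, thread id = {tid}")
```

On a modern kernel every line should show the same getpid() value (the TGID) with a different thread ID per thread. Under the old pre-CLONE_THREAD model, each thread would have reported its own PID instead.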
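Old Linux also handled signals in a non-standard way because of the non-standard thread implementation. As far as I recall from pthreads(7), since each thread had its own PID, a signal sent with kill(2) went to one individual thread rather than to the process as a whole.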
As noted in the comments, Linux 2.4 was also released in 2001.