Thread Identifiers on Linux

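This article was first published on my WeChat official account, 码农手札. The account mainly covers C++ development on Linux, including network programming, along with some interesting algorithm problems. You are welcome to follow it and pick up some programming knowledge in your spare moments. Rome was not built in a day, so let's keep at it together!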

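Preface

I have recently been reading Understanding the Linux Kernel, and I can't help marveling at how broad and deep the kernel is. The book is hard going and packed with detail, so you have to be selective about what you read closely. That is not today's topic, though. Today's topic is how threads are identified on Linux.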

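Thread or process

On Linux, the threads we create are in effect processes. Why? Because creating a thread ultimately goes through the same clone system call as creating a process. The difference is that a thread is created with flags (such as CLONE_VM) that make it share the address space and other resources with its creator. A thread can therefore be regarded as a lightweight process, but the kernel itself draws no line between threads and processes: each one is simply a task described by a task_struct.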

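gettid or pthread_self

The POSIX threads library provides pthread_self to return the identifier of the calling thread, of type pthread_t. An important catch is that pthread_t is not guaranteed to be a numeric type (an integer or a pointer); it may just as well be a struct. That is why Pthreads also provides pthread_equal to check whether two pthread_t values are equal. The unspecified type of pthread_t causes several problems: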

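A pthread_t cannot be printed portably, because its type is unspecified.
There is no portable way to order two pthread_t values or to compute a hash of one, so a pthread_t cannot be used as a key in the STL associative containers.
No invalid pthread_t value can be defined to represent an error or a nonexistent thread, so there is no way to tell whether a given pthread_t is valid.
A pthread_t is only guaranteed to be unique within the current process at the current moment. It is meaningless across processes and therefore cannot serve as a system-wide identifier.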

Next comes gettid, which is the function I recommend using on Linux. To see why, let's start from how it works.

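As the first section showed, every thread in the Linux kernel occupies its own task_struct, so every thread has its own unique tid, and that tid is what gettid returns. (A note on getpid: the standard requires getpid to return the same value for all threads that belong to one process. On Linux, getpid called from any thread therefore returns the tgid, the thread group ID, which is the same as the tid of the main thread.)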

With that background, it is easy to see why gettid is a better fit than pthread_self:

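gettid returns a pid_t, the same type getpid returns, which is usually a small integer.
On Linux it directly names the task the kernel schedules, that is, the unique tid, so the thread is easy to find under /proc. The top command can show it as well, but you need to press H after starting top to switch to the per-thread view.
At any given moment it is unique system-wide (more precisely, within a PID namespace). This is easy to understand: threads are visible to the kernel, so each one must have its own tid.
0 is an invalid value. ID 0 is reserved for the kernel's own idle task that exists from boot and is never handed out to a user-space thread; the first user-space process is init, with pid 1.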

Test code and results

#include <cstdlib>
#include <iostream>
#include <pthread.h>
#include <string.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

// Invoke the system call directly; glibc only ships a gettid() wrapper since 2.30.
#define gettid() syscall(SYS_gettid)

using std::cout;
using std::endl;

const int THREADNUMS = 5;

void *thread_func(void *data) {
  auto num = (long)(data);
  cout << endl;
  cout << "Hello world thread num:" << num << endl;
  pid_t cur_tid = gettid();
  cout << "current thread gettid is:" << cur_tid << endl;
  pthread_t thread_self = pthread_self();
  cout << "current thread thread_self is:" << thread_self << endl;
  pid_t cur_pid = getpid();
  cout << "current thread getpid is:" << cur_pid << endl;
  cout << endl;
  pthread_exit(nullptr);
}

int main(int argc, char const *argv[]) {
  pid_t cur_tid = gettid();
  cout << "main thread gettid is:" << cur_tid << endl;
  pthread_t thread_self = pthread_self();
  cout << "main thread thread_self is:" << thread_self << endl;
  pid_t cur_pid = getpid();
  cout << "main thread getpid is:" << cur_pid << endl;
  cout << endl;
  pthread_t threads[THREADNUMS];
  for (int i = 0; i < THREADNUMS; ++i) {
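    // Widen the index to long first so the cast to a pointer does not change size.
    auto ret = pthread_create(&threads[i], nullptr, thread_func,
                              reinterpret_cast<void *>(static_cast<long>(i)));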
    if (ret != 0) {
      cout << "failed to create pthread_create error:" << strerror(ret) << endl;
      abort();
    }
    ret = pthread_join(threads[i], nullptr);
    if (ret != 0) {
      cout << "failed to call pthread_join error:" << strerror(ret) << endl;
      abort();
    }
  }
  pthread_exit(nullptr);
  return 0;
}

Output:

main thread gettid is:11558
main thread thread_self is:140322486634304
main thread getpid is:11558


Hello world thread num:0
current thread gettid is:11559
current thread thread_self is:140322469283584
current thread getpid is:11558


Hello world thread num:1
current thread gettid is:11560
current thread thread_self is:140322469283584
current thread getpid is:11558


Hello world thread num:2
current thread gettid is:11561
current thread thread_self is:140322469283584
current thread getpid is:11558


Hello world thread num:3
current thread gettid is:11562
current thread thread_self is:140322469283584
current thread getpid is:11558


Hello world thread num:4
current thread gettid is:11563
current thread thread_self is:140322469283584
current thread getpid is:11558

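Analysis: the output confirms the points made above. Here I wait for each thread to finish before starting the next one, and the value returned by pthread_self turns out to be reused from one thread to the next. In this environment pthread_t happens to be defined as unsigned long, which is why it can be printed directly, but that is not portable at all; it is only known to hold on the Ubuntu 16.04 system I tested on. The tid, by contrast, keeps increasing: even though the previous thread has already finished, the new thread does not take over its tid. The kernel hands out IDs in increasing order and only recycles old ones after the counter wraps around, so a tid can be reused eventually, just not right away.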

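Summary

This article is not hard to follow. The key is one concept: on Linux, threads and processes are both visible to the kernel, and a thread is just a "process" that shares resources with other threads. Once that is clear, many other operating system topics become easier to understand.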