libxo question

Kristof Provost kp at FreeBSD.org
Fri Dec 28 20:19:14 UTC 2018


On 28 Dec 2018, at 20:31, Mark Saad wrote:
> All,
>   I am playing around with procstat and libxo on 12-STABLE from
> yesterday. I wanted to get a list of thread IDs for some processes.
> I wrote a quick Python script to grab the data, but the XML output is
> not well-formed. Here is my sample script, which should work on Python
> 2.7:
>
> ----8<-----------------------
>   1 import subprocess as sp
>   2 import os,sys
>   3 import pprint as pp
>   4 import xml.etree.cElementTree as ET
>   5
>   6
>   7 FNULL = open(os.devnull, 'w')
>   8 cmd = "procstat --libxo xml -ta"
>   9 p = sp.Popen(cmd, shell=True, stdout=sp.PIPE,stderr=FNULL,
> executable="/bin/sh")
>  10 text , err = p.communicate()
>  11
>  12 root = ET.fromstring(text)
>  13
>  14 pp.pprint(root)
>  15
>  16 sys.exit(1)
> ------------>8-----------------------
>
> I keep getting this odd error about the XML not being well-formed:
>
> Traceback (most recent call last):
>   File "/tmp/test.py", line 12, in <module>
>     root = ET.fromstring(text)
>   File "<string>", line 124, in XML
> cElementTree.ParseError: not well-formed (invalid token): line 1, column 32
>
> Attached is a copy of the XML. Any guidance would be helpful.
>
The attachment seems to have been eaten by a grue, but I can trivially 
reproduce the problem.
Passing the output of `procstat --libxo xml -ta` to xmllint gives us:

	-:1: parser error : StartTag: invalid element name
	<procstat version="1"><threads><0><process_id>0</process_id><command>kernel</com

The libxo code doesn’t quite cope with some of the subtle differences
between JSON and XML. The one that bites here is that XML tag names must
start with a letter or an underscore. They may contain digits, but may
not start with one, so a tag named after a numeric PID such as <0> is
invalid.
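
To see the rule in isolation, here is a minimal Python sketch. The
helper names and the ASCII-only regular expression are assumptions made
for illustration; the real XML Name rule accepts more characters than
this approximation does.

```python
import re
import xml.etree.ElementTree as ET

# Simplified, ASCII-only approximation of the XML name rule: the first
# character is a letter or underscore, later ones may also be digits,
# dots, or hyphens. (Hypothetical helper, not part of libxo.)
_XML_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_.-]*\Z")


def looks_like_xml_name(name):
    """Return True if `name` passes the simplified name check."""
    return _XML_NAME.match(name) is not None


def parses(document):
    """Return True if the XML parser accepts `document`."""
    try:
        ET.fromstring(document)
        return True
    except ET.ParseError:
        return False


if __name__ == "__main__":
    for tag in ("0", "pid_0", "_0"):
        doc = "<threads><%s>x</%s></threads>" % (tag, tag)
        print("%-6s name ok: %-5s parses: %s"
              % (tag, looks_like_xml_name(tag), parses(doc)))
```

A purely numeric tag fails both checks, while the same number behind a
letter or underscore prefix passes.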

I’ve used the following very quick and dirty patch to make xmllint happy:

	diff --git a/usr.bin/procstat/procstat.c b/usr.bin/procstat/procstat.c
	index 0269d3c5a5f..5c042322e83 100644
	--- a/usr.bin/procstat/procstat.c
	+++ b/usr.bin/procstat/procstat.c
	@@ -152,7 +152,7 @@ procstat(const struct procstat_cmd *cmd, struct procstat *prstat,
	 {
	        char *pidstr = NULL;

	-       asprintf(&pidstr, "%d", kipp->ki_pid);
	+       asprintf(&pidstr, "pid_%d", kipp->ki_pid);
	        if (pidstr == NULL)
	                xo_errc(1, ENOMEM, "Failed to allocate memory in procstat()");
	        xo_open_container(pidstr);
	diff --git a/usr.bin/procstat/procstat_rusage.c b/usr.bin/procstat/procstat_rusage.c
	index 3d8c76370c0..f9caef49a2f 100644
	--- a/usr.bin/procstat/procstat_rusage.c
	+++ b/usr.bin/procstat/procstat_rusage.c
	@@ -126,7 +126,7 @@ print_rusage(struct kinfo_proc *kipp)
	            format_time(&kipp->ki_rusage.ru_stime));

	        if ((procstat_opts & PS_OPT_PERTHREAD) != 0) {
	-               asprintf(&threadid, "%d", kipp->ki_tid);
	+               asprintf(&threadid, "ID_%d", kipp->ki_tid);
	                if (threadid == NULL)
	                        xo_errc(1, ENOMEM,
	                            "Failed to allocate memory in print_rusage()");
	diff --git a/usr.bin/procstat/procstat_sigs.c b/usr.bin/procstat/procstat_sigs.c
	index 984d5d57f95..ceb36ca0dcb 100644
	--- a/usr.bin/procstat/procstat_sigs.c
	+++ b/usr.bin/procstat/procstat_sigs.c
	@@ -155,7 +155,7 @@ procstat_threads_sigs(struct procstat *procstat, struct kinfo_proc *kipp)
	        kinfo_proc_sort(kip, count);
	        for (i = 0; i < count; i++) {
	                kipp = &kip[i];
	-               asprintf(&threadid, "%d", kipp->ki_tid);
	+               asprintf(&threadid, "ID_%d", kipp->ki_tid);
	                if (threadid == NULL)
	                        xo_errc(1, ENOMEM, "Failed to allocate memory in "
	                            "procstat_threads_sigs()");
	diff --git a/usr.bin/procstat/procstat_threads.c b/usr.bin/procstat/procstat_threads.c
	index c62bb516175..17f11044021 100644
	--- a/usr.bin/procstat/procstat_threads.c
	+++ b/usr.bin/procstat/procstat_threads.c
	@@ -66,7 +66,7 @@ procstat_threads(struct procstat *procstat, struct kinfo_proc *kipp)
	        kinfo_proc_sort(kip, count);
	        for (i = 0; i < count; i++) {
	                kipp = &kip[i];
	-               asprintf(&threadid, "%d", kipp->ki_tid);
	+               asprintf(&threadid, "ID_%d", kipp->ki_tid);
	                if (threadid == NULL)
	                        xo_errc(1, ENOMEM, "Failed to allocate memory in "
	                            "procstat_threads()");
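
With the patch applied, a consumer has to strip the prefix again to get
the numeric ID back. A rough sketch of that, assuming the "ID_" prefix
from the patch above; the SAMPLE document and its nesting are
hypothetical and only stand in for real procstat output:

```python
import xml.etree.ElementTree as ET

# Hypothetical, hand-written stand-in for patched `procstat --libxo xml -ta`
# output. The exact nesting of the real output is an assumption.
SAMPLE = (
    '<procstat version="1"><threads>'
    '<pid_0><process_id>0</process_id><command>kernel</command>'
    '<threads>'
    '<ID_100000><thread_name>one</thread_name></ID_100000>'
    '<ID_100001><thread_name>two</thread_name></ID_100001>'
    '</threads>'
    '</pid_0>'
    '</threads></procstat>'
)


def thread_ids(xml_text, prefix="ID_"):
    """Collect numeric IDs from every element named <prefix><digits>."""
    root = ET.fromstring(xml_text)
    ids = []
    for elem in root.iter():  # walks the tree in document order
        suffix = elem.tag[len(prefix):]
        if elem.tag.startswith(prefix) and suffix.isdigit():
            ids.append(int(suffix))
    return ids


if __name__ == "__main__":
    print(thread_ids(SAMPLE))
```

Having to parse data back out of tag names like this is part of why
dynamic tag names are awkward to consume.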

It’s probably not the prettiest XML, and I’m not sure how useful the 
tags are now, but arguably tags with dynamic names are a bad idea 
anyway.
I think you wouldn’t see this problem with JSON, since JSON object keys
are plain strings and may start with a digit, so
`procstat --libxo json -ta` is perhaps a workaround you can consider as
well.
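
A minimal sketch of that JSON route, untested against real procstat
output. The "thread_id" key name and the shape of SAMPLE are
assumptions; the recursive search is there so the code does not depend
on the exact nesting.

```python
import json
import subprocess


def procstat_json(args=("-ta",)):
    """Run procstat with JSON output and return the parsed document.

    Assumes a FreeBSD host whose procstat supports --libxo.
    """
    out = subprocess.check_output(["procstat", "--libxo", "json"] + list(args))
    return json.loads(out.decode("utf-8"))


def find_values(node, key):
    """Recursively collect every value stored under `key`."""
    found = []
    if isinstance(node, dict):
        for k, v in node.items():
            if k == key:
                found.append(v)
            else:
                found.extend(find_values(v, key))
    elif isinstance(node, list):
        for item in node:
            found.extend(find_values(item, key))
    return found


# Hypothetical stand-in for real output; numeric keys are fine in JSON.
SAMPLE = json.loads(
    '{"procstat": {"version": "1", "threads": {"0": {"process_id": 0,'
    ' "command": "kernel", "threads": {"100000": {"thread_id": 100000},'
    ' "100001": {"thread_id": 100001}}}}}}'
)

if __name__ == "__main__":
    print(sorted(find_values(SAMPLE, "thread_id")))
```

On a real system, `find_values(procstat_json(), "thread_id")` would be
the call to try, adjusting the key name to whatever the output actually
uses.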

Regards,
Kristof


More information about the freebsd-hackers mailing list