libxo question
Kristof Provost
kp at FreeBSD.org
Fri Dec 28 20:19:14 UTC 2018
On 28 Dec 2018, at 20:31, Mark Saad wrote:
> All
> I am playing around with procstat and libxo on 12-STABLE from
> yesterday . I wanted to get a list of thread_id's for some processes.
> I wrote a quick python script to grab the data but xml output is not
> well formed. Here is my sample script , which should work on python
> 2.7
>
> ----8<-----------------------
> 1 import subprocess as sp
> 2 import os,sys
> 3 import pprint as pp
> 4 import xml.etree.cElementTree as ET
> 5
> 6
> 7 FNULL = open(os.devnull, 'w')
> 8 cmd = "procstat --libxo xml -ta"
> 9 p = sp.Popen(cmd, shell=True, stdout=sp.PIPE,stderr=FNULL,
> executable="/bin/sh")
> 10 text , err = p.communicate()
> 11
> 12 root = ET.fromstring(text)
> 13
> 14 pp.pprint(root)
> 15
> 16 sys.exit(1)
> ------------>8-----------------------
>
> I am constantly getting this odd issue about the xml being not well
> formatted
>
> Traceback (most recent call last):
> File "/tmp/test.py", line 12, in <module>
> root = ET.fromstring(text)
> File "<string>", line 124, in XML
> cElementTree.ParseError: not well-formed (invalid token): line 1,
> column 32
>
> Attached is a copy of the xml. Any guidance would be helpful.
>
The attachment seems to have been eaten by a grue, but I can trivially
reproduce the problem.
Passing the output of `procstat --libxo xml -ta` to xmllint gives us:
-:1: parser error : StartTag: invalid element name
<procstat version="1"><threads><0><process_id>0</process_id><command>kernel</com
The libxo code doesn’t quite cope with some of the subtle differences
between JSON and XML. In this case, the problem is that XML tag names
must start with a letter or an underscore. They may contain digits, but
may not start with one.
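To make the naming rule concrete, here is a minimal Python sketch. The regular expression is a simplified, ASCII-only approximation of the XML name rule (the real rule also permits colons and many non-ASCII characters), and the helper names `is_ascii_xml_name` and `parses` are made up for this illustration.

```python
import re
import xml.etree.ElementTree as ET

# Simplified, ASCII-only approximation of the XML name rule:
# first character is a letter or underscore, the rest may also
# include digits, hyphens and dots.
_ASCII_XML_NAME = re.compile(r"^[A-Za-z_][A-Za-z0-9_.-]*$")


def is_ascii_xml_name(name):
    """Return True if `name` passes the simplified name check."""
    return _ASCII_XML_NAME.match(name) is not None


def parses(xml_text):
    """Return True if ElementTree accepts `xml_text` as well-formed."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True
```

With these helpers, a bare PID such as `0` fails the name check and `<0></0>` is rejected by the parser, while a prefixed name such as `pid_0` passes both.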
I’ve used the following very quick-and-dirty patch to make xmllint happy:
diff --git a/usr.bin/procstat/procstat.c b/usr.bin/procstat/procstat.c
index 0269d3c5a5f..5c042322e83 100644
--- a/usr.bin/procstat/procstat.c
+++ b/usr.bin/procstat/procstat.c
@@ -152,7 +152,7 @@ procstat(const struct procstat_cmd *cmd, struct procstat *prstat,
{
char *pidstr = NULL;
- asprintf(&pidstr, "%d", kipp->ki_pid);
+ asprintf(&pidstr, "pid_%d", kipp->ki_pid);
if (pidstr == NULL)
xo_errc(1, ENOMEM, "Failed to allocate memory in procstat()");
xo_open_container(pidstr);
diff --git a/usr.bin/procstat/procstat_rusage.c b/usr.bin/procstat/procstat_rusage.c
index 3d8c76370c0..f9caef49a2f 100644
--- a/usr.bin/procstat/procstat_rusage.c
+++ b/usr.bin/procstat/procstat_rusage.c
@@ -126,7 +126,7 @@ print_rusage(struct kinfo_proc *kipp)
format_time(&kipp->ki_rusage.ru_stime));
if ((procstat_opts & PS_OPT_PERTHREAD) != 0) {
- asprintf(&threadid, "%d", kipp->ki_tid);
+ asprintf(&threadid, "ID_%d", kipp->ki_tid);
if (threadid == NULL)
xo_errc(1, ENOMEM,
"Failed to allocate memory in print_rusage()");
diff --git a/usr.bin/procstat/procstat_sigs.c b/usr.bin/procstat/procstat_sigs.c
index 984d5d57f95..ceb36ca0dcb 100644
--- a/usr.bin/procstat/procstat_sigs.c
+++ b/usr.bin/procstat/procstat_sigs.c
@@ -155,7 +155,7 @@ procstat_threads_sigs(struct procstat *procstat, struct kinfo_proc *kipp)
kinfo_proc_sort(kip, count);
for (i = 0; i < count; i++) {
kipp = &kip[i];
- asprintf(&threadid, "%d", kipp->ki_tid);
+ asprintf(&threadid, "ID_%d", kipp->ki_tid);
if (threadid == NULL)
xo_errc(1, ENOMEM, "Failed to allocate memory in "
"procstat_threads_sigs()");
diff --git a/usr.bin/procstat/procstat_threads.c b/usr.bin/procstat/procstat_threads.c
index c62bb516175..17f11044021 100644
--- a/usr.bin/procstat/procstat_threads.c
+++ b/usr.bin/procstat/procstat_threads.c
@@ -66,7 +66,7 @@ procstat_threads(struct procstat *procstat, struct kinfo_proc *kipp)
kinfo_proc_sort(kip, count);
for (i = 0; i < count; i++) {
kipp = &kip[i];
- asprintf(&threadid, "%d", kipp->ki_tid);
+ asprintf(&threadid, "ID_%d", kipp->ki_tid);
if (threadid == NULL)
xo_errc(1, ENOMEM, "Failed to allocate memory in "
"procstat_threads()");
It’s probably not the prettiest XML, and I’m not sure how useful the
tags are now, but arguably tags with dynamic names are a bad idea
anyway.
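If rebuilding procstat is not an option, the same renaming could be applied on the consumer side before parsing. The following is only a sketch: the helper `prefix_numeric_tags` and the `ID_` default prefix are made up for this illustration, and the sample document is a hypothetical, completed version of the truncated output above. It assumes that a literal `<` never appears unescaped in element text, which should hold for well-formed XML.

```python
import re
import xml.etree.ElementTree as ET

# Matches an opening or closing tag whose name consists only of digits,
# e.g. "<0>" or "</100000>". The lookahead keeps the rest of the tag intact.
_NUMERIC_TAG = re.compile(r"<(/?)(\d+)(?=[\s/>])")


def prefix_numeric_tags(xml_text, prefix="ID_"):
    """Rewrite purely numeric tag names so that they start with `prefix`."""
    return _NUMERIC_TAG.sub(
        lambda m: "<" + m.group(1) + prefix + m.group(2), xml_text
    )


# Hypothetical sample; the real procstat output contains more fields.
sample = (
    '<procstat version="1"><threads><0>'
    "<process_id>0</process_id><command>kernel</command>"
    "</0></threads></procstat>"
)

root = ET.fromstring(prefix_numeric_tags(sample))
```

Element text such as the `0` inside `<process_id>` is left alone, because the pattern only matches digits that directly follow `<` or `</`.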
I think you wouldn’t see this problem with JSON, so perhaps that’s a
workaround you can consider as well.
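A rough sketch of the JSON route, in Python, might look like this. The key name `thread_id` and the nesting in `SAMPLE` are assumptions modelled on the XML output above, not verified against real `procstat --libxo json -ta` output, and `collect_values` is a hypothetical helper that simply walks whatever structure it is given.

```python
import json
import subprocess


def collect_values(node, key):
    """Recursively collect every value stored under `key` in parsed JSON."""
    found = []
    if isinstance(node, dict):
        for name, value in node.items():
            if name == key:
                found.append(value)
            else:
                found.extend(collect_values(value, key))
    elif isinstance(node, list):
        for item in node:
            found.extend(collect_values(item, key))
    return found


def thread_ids_from_procstat():
    """Run procstat with JSON output and return all thread_id values."""
    raw = subprocess.check_output(["procstat", "--libxo", "json", "-ta"])
    return collect_values(json.loads(raw.decode("utf-8")), "thread_id")


# Hypothetical sample; the layout of the real output is an assumption.
SAMPLE = (
    '{"procstat": {"threads": {"0": {"process_id": 0, "command": "kernel",'
    ' "threads": {"100000": {"thread_id": 100000}}}}}}'
)
```

Because numeric strings are perfectly valid JSON object keys, per-process and per-thread containers named after bare IDs do not trip up the parser here.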
Regards,
Kristof